__NR_sys_kexec_load should be __NR_kexec_load. Mainly affects users of the
_syscallN() macros, and glibc is already checking for __NR_kexec_load.
Cc: Ulrich Drepper <drepper@redhat.com>
Cc: "Eric W. Biederman" <ebiederm@xmission.com>
Cc: Mikael Starvik <starvik@axis.com>
Cc: David Howells <dhowells@redhat.com>
Cc: Yoshinori Sato <ysato@users.sourceforge.jp>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: Paul Mundt <lethal@linux-sh.org>
Cc: Kazumoto Kojima <kkojima@rr.iij4u.or.jp>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
A couple of /proc/vmcore data structures overflow with 32bit systems having
memory more than 4G. This patch fixes those.
Signed-off-by: Ken'ichi Ohmichi <oomichi@mxs.nes.nec.co.jp>
Signed-off-by: Vivek Goyal <vgoyal@in.ibm.com>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Before commit 47e65328a7, next_thread() took
a const task_t. Reinstate the const qualifier, getting the next thread
never changes the current thread.
Signed-off-by: Keith Owens <kaos@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We changed the wrong symbol. It's tty_insert_flip_string_flags() which is
called from the previously-non-GPL'ed now-inlined tty_insert_flip_char().
Fix that up, and uninline tty_schedule_flip() while we're there.
Cc: Tobias Powalowski <t.powa@gmx.de>
Cc: Alan Cox <alan@lxorguk.ukuu.org.uk>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Lay out the structure definitions in include/linux/leds.h to be aligned as
much as possible. Also minor updates to the comments to make them more
concise.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Acked-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
GPIO LED support for Samsung S3C24XX SoC series processors.
Signed-off-by: Ben Dooks <ben-linux@fluff.org>
Acked-by: Richard Purdie <rpurdie@rpsys.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Implement the scheduled unexport of panic_timeout.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Ulrich suggested that the `flags' arg to sync_file_range() become unsigned.
Cc: Ulrich Drepper <drepper@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Some string functions were safely overrideable in lib/string.c, but their
corresponding declarations in linux/string.h were not. Correct this, and
make strcspn overrideable.
Odds of someone wanting to do optimized assembly of these are small, but
for the sake of cleanliness, might as well bring them into line with the
rest of the file.
Signed-off-by: Kyle McMartin <kyle@parisc-linux.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Current implementations define NODES_SHIFT in include/asm-xxx/numnodes.h for
each arch. Its definition is sometimes configurable. Indeed, ia64 defines 5
NODES_SHIFT values in the current git tree. But it looks a bit messy.
SGI-SN2(ia64) system requires 1024 nodes, and the number of nodes already has
been changeable by config. Suitable node's number may be changed in the
future even if it is other architecture. So, I wrote configurable node's
number.
This patch set defines just default value for each arch which needs multi
nodes except ia64. But, it is easy to change to configurable if necessary.
On ia64 the number of nodes can be already configured in generic ia64 and SN2
config. But, NODES_SHIFT is defined for DIG64 and HP'S machine too. So, I
changed it so that all platforms can be configured via CONFIG_NODES_SHIFT. It
would be simpler.
See also: http://marc.theaimsgroup.com/?l=linux-kernel&m=114358010523896&w=2
Signed-off-by: Yasunori Goto <y-goto@jp.fujitsu.com>
Cc: Hirokazu Takata <takata@linux-m32r.org>
Cc: "Luck, Tony" <tony.luck@intel.com>
Cc: Andi Kleen <ak@muc.de>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Ivan Kokshaysky <ink@jurassic.park.msu.ru>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Kyle McMartin <kyle@mcmartin.ca>
Cc: Russell King <rmk@arm.linux.org.uk>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Jack Steiner <steiner@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
include/asm/atomic.h:94: warning: implicit declaration of function 'unlikely'
include/asm/atomic.h:97: warning: implicit declaration of function 'likely'
Signed-off-by: Dave Jones <davej@redhat.com>
Cc: Martin Schwidefsky <schwidefsky@de.ibm.com>
Cc: Heiko Carstens <heiko.carstens@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make the length of ebcdic<->ascii conversion arrays known. This avoid
warnings with source code checking tools.
Signed-off-by: Peter Oberparleiter <peter.oberparleiter@de.ibm.com>
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Move the prototype from arch-generic to arch-specific includes because on
x86_64 these functions are two static inlines.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Cc: Jeff Dike <jdike@addtoit.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Introduce GFP_NOWAIT, as an alias for GFP_ATOMIC & ~__GFP_HIGH.
This also changes XFS, which is the only in-tree user of this idiom that I
could find. The XFS piece is compile-tested only.
Signed-off-by: Jeff Dike <jdike@addtoit.com>
Acked-by: Nathan Scott <nathans@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Update {get,put}_user macros for m32r kernel.
- Modify get_user to use __get_user_asm macro, instead of __get_user_x macro.
- Remove arch/m32r/lib/{get,put}user.S.
- Some cosmetic updates.
I would like to thank NIIBE Yutaka for his reporting about the m32r kernel's
security problem in {get,put}_user macros.
There were no address checking for user space access in {get,put}_user macros.
;-)
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Cc: NIIBE Yutaka <gniibe@fsij.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch fixes a boot problem of the m32r SMP kernel 2.6.16-rc1-mm3 or
later.
In this patch, cpu_possible_map is statically initialized, and cpu_present_map
is also copied from cpu_possible_map in smp_prepare_cpus(), because the m32r
architecture has not supported CPU hotplug yet.
Signed-off-by: Hayato Fujiwara <fujiwara.hayato@renesas.com>
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Cc: <stable@kernel.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Add some documentation regarding the utilisation of the flags field in
struct page. This field is overloaded for per page bits and to hold node,
zone and SPARSEMEM information. Make it clear which areas are used for
what and how many bits are in each area.
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
These patches are an enhancement of OVERCOMMIT_GUESS algorithm in
__vm_enough_memory().
- why the kernel needed patching
When the kernel can't allocate anonymous pages in practice, currnet
OVERCOMMIT_GUESS could return success. This implementation might be
the cause of oom kill in memory pressure situation.
If the Linux runs with page reservation features like
/proc/sys/vm/lowmem_reserve_ratio and without swap region, I think
the oom kill occurs easily.
- the overall design approach in the patch
When the OVERCOMMET_GUESS algorithm calculates number of free pages,
the reserved free pages are regarded as non-free pages.
This change helps to avoid the pitfall that the number of free pages
become less than the number which the kernel tries to keep free.
- testing results
I tested the patches using my test kernel module.
If the patches aren't applied to the kernel, __vm_enough_memory()
returns success in the situation but autual page allocation is
failed.
On the other hand, if the patches are applied to the kernel, memory
allocation failure is avoided since __vm_enough_memory() returns
failure in the situation.
I checked that on i386 SMP 16GB memory machine. I haven't tested on
nommu environment currently.
This patch adds totalreserve_pages for __vm_enough_memory().
Calculate_totalreserve_pages() checks maximum lowmem_reserve pages and
pages_high in each zone. Finally, the function stores the sum of each
zone to totalreserve_pages.
The totalreserve_pages is calculated when the VM is initilized.
And the variable is updated when /proc/sys/vm/lowmem_reserve_raito
or /proc/sys/vm/min_free_kbytes are changed.
Signed-off-by: Hideo Aoki <haoki@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
for_each_cpu() actually iterates across all possible CPUs. We've had mistakes
in the past where people were using for_each_cpu() where they should have been
iterating across only online or present CPUs. This is inefficient and
possibly buggy.
We're renaming for_each_cpu() to for_each_possible_cpu() to avoid this in the
future.
This patch replaces for_each_cpu with for_each_possible_cpu.
for sparc64.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Acked-by: "David S. Miller" <davem@davemloft.net>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
reshape_position is a 64bit field that was not 64bit aligned. So swap with
new_level.
NOTE: this is a user-visible change. However:
- The bad code has not appeared in a released kernel
- This code is still marked 'experimental'
- This only affects version-1 superblock, which are not in wide use
- These field are only used (rather than simply reported) by user-space
tools in extemely rare circumstances : after a reshape crashes in the
first second of the reshape process.
So I believe that, at this stage, the change is safe. Especially if people
heed the 'help' message on use mdadm-2.4.1.
Signed-off-by: Neil Brown <neilb@suse.de>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Oleg Nesterov spotted two interesting bugs with the current de_thread
code. The simplest is a long standing double decrement of
__get_cpu_var(process_counts) in __unhash_process. Caused by
two processes exiting when only one was created.
The other is that since we no longer detach from the thread_group list
it is possible for do_each_thread when run under the tasklist_lock to
see the same task_struct twice. Once on the task list as a
thread_group_leader, and once on the thread list of another
thread.
The double appearance in do_each_thread can cause a double increment
of mm_core_waiters in zap_threads resulting in problems later on in
coredump_wait.
To remedy those two problems this patch takes the simple approach
of changing the old thread group leader into a child thread.
The only routine in release_task that cares is __unhash_process,
and it can be trivially seen that we handle cleaning up a
thread group leader properly.
Since de_thread doesn't change the pid of the exiting leader process
and instead shares it with the new leader process. I change
thread_group_leader to recognize group leadership based on the
group_leader field and not based on pids. This should also be
slightly cheaper then the existing thread_group_leader macro.
I performed a quick audit and I couldn't see any user of
thread_group_leader that cared about the difference.
Signed-off-by: Eric W. Biederman <ebiederm@xmission.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Patch from Catalin Marinas
The X variants are deprecated starting with ARMv6. Using the D variants,
the fpmx_state in vfp_hard_struct is no longer needed.
Signed-off-by: Catalin Marinas <catalin.marinas@arm.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Overriding the whole EH code is a per-transport, not per-host thing.
Move ->eh_strategy_handler to the transport class, same as
->eh_timed_out.
Downside is that scsi_host_alloc can't check for the total lack of EH
anymore, but the transition period from old EH where we needed it is
long gone already.
Signed-off-by: Christoph Hellwig <hch@lst.de>
Signed-off-by: Jeff Garzik <jeff@garzik.org>
Rohit found an obscure bug causing buddy list corruption.
page_is_buddy is using a non-atomic test (PagePrivate && page_count == 0)
to determine whether or not a free page's buddy is itself free and in the
buddy lists.
Each of the conjuncts may be true at different times due to unrelated
conditions, so the non-atomic page_is_buddy test may find each conjunct to
be true even if they were not both true at the same time (ie. the page was
not on the buddy lists).
Signed-off-by: Martin Bligh <mbligh@google.com>
Signed-off-by: Rohit Seth <rohitseth@google.com>
Signed-off-by: Nick Piggin <npiggin@suse.de>
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Deinline a few functions which produce 200+ bytes of code.
Size Uses Wasted Name and definition
===== ==== ====== ================================================
429 3 818 __inet6_lookup include/net/inet6_hashtables.h
404 2 384 __inet6_lookup_established include/net/inet6_hashtables.h
206 3 372 __inet6_hash include/net/inet6_hashtables.h
Signed-off-by: Denis Vlasenko <vda@ilport.com.ua>
Signed-off-by: David S. Miller <davem@davemloft.net>
Add checksum operation which takes care of verifying the checksum and
dealing with HW checksum errors and avoids multiple checksum
operations by setting ip_summed to CHECKSUM_UNNECESSARY after
successful verification.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Change the queue rerouter intrastructure to a generic usable
infrastructure for address family specific operations as a base for
some cleanups.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Move prototypes of NAT callbacks to ip_conntrack_h323.h. Because the
use of typedefs as arguments, some header files need to be moved as
well.
Signed-off-by: Jing Min Zhao <zhaojingmin@users.sourceforge.net>
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
The conntrack code doesn't do re-fragmentation of defragmented packets
anymore but relies on fragmentation in the IP layer. Purely bridged
packets don't pass through the IP layer, so the bridge netfilter code
needs to take care of fragmentation itself.
Signed-off-by: Patrick McHardy <kaber@trash.net>
Signed-off-by: David S. Miller <davem@davemloft.net>
Patch from Lennert Buytenhek
The debug-8250 macros do byte accesses, which means that if we're in
big-endian mode, we need to logically OR the UART address with 3, as
the LSB byte lane (where UART data and status is transferred) has the
highest byte address in the word when we are in big-endian mode.
It's unclear why this problem didn't surface earlier.
Signed-off-by: Lennert Buytenhek <buytenh@wantstofly.org>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Or rather compute it based on the table length automatically.
This also has the intended side effect of not warning for new system calls
anymore.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Fix CONFIG_REORDER.
The value of cflags-y was assined to CFLAGS before cflags-y was assigned
the value used for CONFIG_REORDER.
Use cflags-y for all CFLAGS options in the Makefile to avoid this
happening again.
Signed-off-by: Sam Ravnborg <sam@ravnborg.org>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
If the HPET timer is enabled, the clock can drift by ~3 seconds a day.
This is due to the HPET timer not being initialized with the correct
setting (still using PIT count).
If HZ changes, this drift can become even more pronounced.
HPET patch initializes tick_nsec with correct tick_nsec settings for
HPET timer.
Vojtech comments:
"It's not entirely correct (it assumes the HPET ticks totally
exactly), but it's significantly better than assuming the PIT error
there."
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Machine checks can stall the machine for a long time and
it's not good to trigger the nmi watchdog during that.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The generic linux/numa.h file defines NODES_SHIFT to 0 in case
the architecture did not.
Every architecture which has a NUMA config option defines
NODES_SHIFT in its asm-$ARCH headers, but only if NUMA is
enabled, except for x86_64.
This should make it like all the rest.
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
AMD systems have a modern APIC that supports 8 bit IDs, but
don't have a XAPIC version number. Add a new "modern_apic"
subfunction that handles this correctly and use it (nearly)
everywhere where XAPIC is tested for.
I removed one wart: the code specified that external APICs
would use an 8bit APIC ID. But I checked a real 82093 data sheet
and it says clearly that they only use 4bit. So I removed
this special case since it would a bit awkward to implement now.
I removed the valid APIC tests in mptable parsing completely. On any modern
system they only check against the full field width (8bit) anyways
and are no-ops. This also fixes them doing the wrong thing
on >8 core Opterons.
This makes i386 boot again on 16 core Opterons.
Cc: Ingo Molnar <mingo@elte.hu>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Introduce a e820_all_mapped() function which checks if the entire range
<start,end> is mapped with type.
This is done by moving the local start variable to the end of each
known-good region; if at the end of the function the start address is
still before end, there must be a part that's not of the correct type;
otherwise it's a good region.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Rename e820_mapped to e820_any_mapped since it tests if any part of the
range is mapped according to the type.
Later steps will introduce e820_all_mapped which will check if the
entire range is mapped with the type. Both have their merit.
Signed-off-by: Arjan van de Ven <arjan@linux.intel.com>
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The node setup code would try to allocate the node metadata in the node
itself, but that fails if there is no memory in there.
This can happen with memory hotplug when the hotplug area defines an so
far empty node.
Now use bootmem to try to allocate the mem_map in other nodes.
And if it fails don't panic, but just ignore the node.
To make this work I added a new __alloc_bootmem_nopanic function that
does what its name implies.
TBD should try to use nearby nodes here. Currently we just use any.
It's hard to do it better because bootmem doesn't have proper fallback
lists yet.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
From: Keith Mannthey, Andi Kleen
Implement memory hotadd without sparsemem. The memory in the SRAT
hotadd area is just preserved instead and can be activated later.
There are a few restrictions:
- Only one continuous hotadd area allowed per node
The main problem is dealing with the many buggy SRAT tables
that are out there. The strategy here is to reject anything
suspicious.
Originally from Keith Mannthey, with several hacks and changes by AK
and also contributions from Andrew Morton
[ TBD: Problems pointed out by KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitsu.com>:
1) Goto's rebuild_zonelist patch will not work if CONFIG_MEMORY_HOTPLUG=n.
Rebuilding zonelist is necessary when the system has just memory <
4G at boot, and hot add memory > 4G. because x86_64 has DMA32,
ZONE_NORAML is not included into zonelist at boot time if system
doesn't have memory >4G at boot.
[AK: should just force the higher zones at boot time when SRAT tells us]
2) zone and node's spanned_pages and present_pages are not incremented.
They should be.
For example, our server (ia64/Fujitsu PrimeQuest) can equip memory
from 4G to 1T(maybe 2T in future), and SRAT will *always* say we have
possible 1T +memory. (Microsoft requires "write all possible memory
in SRAT") When we reserve memmap for possible 1T memory, Linux will
not work well in +minimum 4G configuraion ;)
[AK: needs limiting to 5-10% of max memory]
]
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Memory hotadd doesn't need SPARSEMEM, but can be handled by just preallocating
mem_maps. This only needs some untangling of ifdefs to enable the necessary
code even without SPARSEMEM.
Originally from Keith Mannthey, hacked by AK.
Signed-off-by: Andi Kleen <ak@suse.de>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
FLUSH_BASE must be visible to arch/arm/mm/init.c in order for the
memory region to be setup. Move these definitions from
asm-arm/arch-*/hardware.h into asm-arm/arch-*/memory.h where mm
stuff can see them.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This patch fixes arch_local_page_offset(pfn,nid) in arm.
This new one (added by unify_pfn_to_page patches) is obviously buggy.
This macro calculate page offset in a node.
Note: about LOCAL_MAP_NR()
comment in arm's sub-archs says...
/*
* Given a kaddr, LOCAL_MAP_NR finds the owning node of the memory
* and returns the index corresponding to the appropriate page in the
* node's mem_map.
*/
but LOCAL_MAP_NR() is designed to be able to take both paddr and kaddr.
In this case, paddr is better.
Signed-off-by: KAMEZAWA Hiroyuki <kamezawa.hiroyu@jp.fujitu.com>
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>