Introduce a simple allocator for the NUMA remap space. This space is very
scarce, used for structures which are best allocated node local.
This mechanism is also used on non-NUMA ia64 systems with a vmem_map to keep
the pgdat->node_mem_map initialized in a consistent place for all
architectures.
Issues:
o alloc_remap takes a node_id where we might expect a pgdat. This was intended
to allow us to allocate the pgdats themselves using this mechanism, which we
do not yet do. Could have alloc_remap_node() and alloc_remap_nid() for this
purpose.
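For illustration only, a minimal user-space sketch of how such an allocator
could work, assuming a simple per-node bump pointer. The array names,
MAX_NUMNODES, REMAP_ALIGN and the remap_init_node() helper are assumptions
made for this sketch and are not taken from the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define MAX_NUMNODES 4          /* assumed, for illustration only */
#define REMAP_ALIGN  32UL       /* assumed cache-line-sized alignment */

/* Per-node window of remap space: next free byte and end of window. */
static char *remap_alloc_vaddr[MAX_NUMNODES];
static char *remap_end_vaddr[MAX_NUMNODES];

/* Hypothetical helper: hand a node its (scarce) remap window. */
static void remap_init_node(int nid, void *start, size_t len)
{
	assert(nid >= 0 && nid < MAX_NUMNODES);
	remap_alloc_vaddr[nid] = start;
	remap_end_vaddr[nid] = (char *)start + len;
}

/* Bump-allocate zeroed, node-local memory; NULL if the window is exhausted. */
static void *alloc_remap(int nid, size_t size)
{
	char *allocation;

	if (nid < 0 || nid >= MAX_NUMNODES)
		return NULL;
	allocation = remap_alloc_vaddr[nid];
	/* Round the request up to the assumed alignment. */
	size = (size + REMAP_ALIGN - 1) & ~(REMAP_ALIGN - 1);
	if (!allocation || size > (size_t)(remap_end_vaddr[nid] - allocation))
		return NULL;
	remap_alloc_vaddr[nid] += size;
	memset(allocation, 0, size);
	return allocation;
}
```

Nothing is ever freed in this sketch, which keeps the allocator trivial and
suits structures that live for the lifetime of the system.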
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The following four patches provide the last needed changes before the
introduction of sparsemem. For a more complete description of what this
will do, please see this patch:
http://www.sr71.net/patches/2.6.11/2.6.11-bk7-mhp1/broken-out/B-sparse-150-sparsemem.patch
or previous posts on the subject:
http://marc.theaimsgroup.com/?t=110868540700001&r=1&w=2
http://marc.theaimsgroup.com/?l=linux-mm&m=109897373315016&w=2
Three of these are i386-only, but one of them reorganizes the macros
used to manage the space in page->flags, and will affect all platforms.
There are analogous patches to the i386 ones for ppc64, ia64, and
x86_64, but those will be submitted by the normal arch maintainers.
The combination of the four patches has been test-booted on a variety of
i386 hardware, and compiled for ppc64, i386, and x86-64 with about 17
different .configs. It's also been runtime-tested on ia64 configs (with
more patches on top).
This patch:
We _know_ which node pages in general belong to, at least at a very gross
level in node_{start,end}_pfn[]. Use those to target the allocations of
pages.
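As a rough sketch of the idea, a pfn can be matched against the per-node
ranges to pick a target node. The array sizes, the exclusive end bound and
the helper names below are assumptions for illustration, not the code in
this patch.

```c
#include <assert.h>

#define MAX_NUMNODES 4   /* assumed, for illustration only */

/* Gross per-node pfn ranges; end is treated as exclusive in this sketch. */
static unsigned long node_start_pfn[MAX_NUMNODES];
static unsigned long node_end_pfn[MAX_NUMNODES];

/* Hypothetical helper: record the pfn range covered by a node. */
static void set_node_range(int nid, unsigned long start, unsigned long end)
{
	assert(nid >= 0 && nid < MAX_NUMNODES && start <= end);
	node_start_pfn[nid] = start;
	node_end_pfn[nid] = end;
}

/* Return the node whose [start, end) range contains pfn, or -1 if none. */
static int pfn_range_to_nid(unsigned long pfn)
{
	int nid;

	for (nid = 0; nid < MAX_NUMNODES; nid++)
		if (pfn >= node_start_pfn[nid] && pfn < node_end_pfn[nid])
			return nid;
	return -1;
}
```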
Signed-off-by: Andy Whitcroft <apw@shadowen.org>
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch effectively eliminates direct use of pgdat->node_mem_map outside
of the DISCONTIG code. On a flat memory system, these fields aren't
currently used, neither are they on a sparsemem system.
There was also a node_mem_map(nid) macro on many architectures. Its use
along with the use of ->node_mem_map itself was not consistent. It has
been removed in favor of two new, more explicit, arch-independent macros:
pgdat_page_nr(pgdat, pagenr)
nid_page_nr(nid, pagenr)
I called them "pgdat" and "nid" because we overload the term "node" to mean
"NUMA node", "DISCONTIG node" or "pg_data_t" in very confusing ways. I
believe the newer names are much clearer.
These macros can be overridden in the sparsemem case with a theoretically
slower operation using node_start_pfn and pfn_to_page(), instead. We could
make this the only behavior if people want, but I don't want to change too
much at once. One thing at a time.
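A user-space sketch of the two macros and of the pfn-based alternative
follows. The struct layouts, the toy mem_map, toy_node_init() and the name
pgdat_page_nr_by_pfn are stand-ins assumed for illustration; only
pgdat_page_nr, nid_page_nr, node_mem_map, node_start_pfn and pfn_to_page
come from the description above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_NUMNODES 2      /* assumed for the sketch */
#define TOY_PAGES    16     /* assumed size of the toy mem_map */

struct page { unsigned long flags; };   /* stand-in for struct page */

typedef struct pglist_data {
	struct page *node_mem_map;      /* first struct page of the node */
	unsigned long node_start_pfn;   /* first page frame number of the node */
} pg_data_t;

static struct page mem_map[TOY_PAGES];  /* toy flat mem_map backing all nodes */
static pg_data_t node_data[MAX_NUMNODES];

#define NODE_DATA(nid)   (&node_data[(nid)])
#define pfn_to_page(pfn) (mem_map + (pfn))

/* Direct form: index the node's own mem_map by node-relative page number. */
#define pgdat_page_nr(pgdat, pagenr) ((pgdat)->node_mem_map + (pagenr))
#define nid_page_nr(nid, pagenr)     pgdat_page_nr(NODE_DATA(nid), (pagenr))

/* Alternative form via node_start_pfn and pfn_to_page(). */
#define pgdat_page_nr_by_pfn(pgdat, pagenr) \
	pfn_to_page((pgdat)->node_start_pfn + (pagenr))

/* Hypothetical helper: place a node at start_pfn inside the toy mem_map. */
static void toy_node_init(int nid, unsigned long start_pfn)
{
	assert(nid >= 0 && nid < MAX_NUMNODES && start_pfn < TOY_PAGES);
	node_data[nid].node_start_pfn = start_pfn;
	node_data[nid].node_mem_map = pfn_to_page(start_pfn);
}
```

Both forms resolve to the same struct page in this toy setup; the pfn-based
one simply never touches node_mem_map.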
This patch removes more code than it adds.
Compile tested on alpha, alpha discontig, arm, arm-discontig, i386, i386
generic, NUMAQ, Summit, ppc64, ppc64 discontig, and x86_64. Full list
here: http://sr71.net/patches/2.6.12/2.6.12-rc1-mhp2/configs/
Boot tested on NUMAQ, x86 SMP and ppc64 power4/5 LPARs.
Signed-off-by: Dave Hansen <haveblue@us.ibm.com>
Signed-off-by: Martin J. Bligh <mbligh@aracnet.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently reset and powerdown are not implemented on the Maple board,
and attempting either will (incorrectly) return. This patch implements
the proper communication with the service processor, allowing correct
reset and powerdown on the Maple board. If it is somehow unable to
communicate with the service processor, it will loop forever instead.
Note that powerdown on the Maple will power down the CPUs, but not the
fans or other board components due to hardware and firmware
limitations.
Signed-off-by: David Gibson <dwg@au1.ibm.com>
Signed-off-by: Frank Rowand <frowand@mvista.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
For I/O DLPAR to work properly, the kernel needs to allow for dynamic
assignment of the irq field of the pci_dev structure upon dynamic bus
addition. This patch moves the assignment of that field from
pSeries_final_fixup() to pcibios_fixup_bus(), which enables dynamic
assignment for the children of a newly added bus.
Currently, pci_devs receive their irq numbers in one of two ways. The
irq line is either read at boot for all pci_devs, or read by the rpaphp
module at slot enable time. The latter is no longer sufficient for
DLPAR addition of slots that don't qualify as PCI-hotplug capable.
This solution handles the cases of boot and dynamic add.
Signed-off-by: John Rose <johnrose@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This patch corrects the printing of progress indicators to the op
panel on p/iSeries ppc64 systems. Each discrete reference code should
begin with a form feed char to clear the op panel, and the first and
second lines should be separated with a CR/LF sequence. Padding with
spaces is not necessary.
Also, capitalize the hex value printed on the first line, to be
consistent with the values printed by firmware, service processor,
etc.
It turns out that there's an ibm,form-feed property; this patch uses
it in the pSeries-specific progress routine. This patch also checks
the number of rows and the specific width of each row (the second row
on power5 systems can actually hold 80 characters). If the displayed
text is too wide for the physical display, it can be viewed in the ASM
menus, or by selecting option 14 on the op panel.
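The message layout described above could be sketched roughly as follows.
The function name, the four-digit width of the hex code and passing the
form feed character as a parameter are assumptions for illustration; the
real routine talks to the op panel through firmware rather than filling a
buffer.

```c
#include <assert.h>
#include <stdio.h>

/*
 * Build one progress message: form feed to clear the panel, upper-case
 * hex code on the first line, CR/LF, then the text on the second line.
 * No space padding is added. Returns the snprintf() result.
 */
static int format_progress(char *buf, size_t len, char form_feed,
			   unsigned int hex, const char *text)
{
	assert(buf != NULL && text != NULL);
	return snprintf(buf, len, "%c%04X\r\n%s",
			form_feed, hex & 0xFFFFu, text);
}
```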
Signed-off-by: Mike Strosaker <strosake@austin.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Implementation of software load support for the BE iommu. This is very
different from other iommu code on ppc64, since we only do a static mapping.
The mapping is currently hardcoded but should really be read from the
firmware, but they don't set up the device nodes yet. There is a single
512MB DMA window for PCI, USB and ethernet at 0x20000000 for our RAM.
The Cell processor can put the I/O page table either in memory like
the hashed page table (hardware load) or have the operating system
write the entries into memory mapped CPU registers (software load).
I use the software load mechanism because I know that all I/O page
table entries for the amount of installed physical memory fit into
the IO TLB cache. At the point when we get machines with more than
4GB of installed memory, we can either use hardware I/O page table
access like the other platforms do or dynamically update the I/O
TLB entries when a page fault occurs in the I/O subsystem.
The software load can then use the macros that I have implemented
for the static mapping in order to do the TLB cache updates.
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Add support for the integrated interrupt controller on BPA
CPUs. There is one of those for each SMT thread.
The mapping of interrupt numbers to HW interrupt sources
is described in arch/ppc64/kernel/bpa_iic.h.
This version hardcodes the 'Spider' chip as the secondary
interrupt controller. That is not really generic for the
architecture, but at the moment it is the only secondary
PIC that exists.
A little more work will be needed on this as soon as
we have boards with multiple external interrupt controllers.
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
This adds the basic support for running on BPA machines.
So far, this is only the IBM workstation, and it will
not run on others without a little more generalization.
It should be possible to configure a kernel for any
combination of CONFIG_PPC_BPA with any of the other
multiplatform targets.
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The firmware provides the location and size of the nvram
in the device tree, so it does not really contain any
hardware specific bits and could be used on other
machines as well.
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The pSeries_progress function is called from some places in the rtas code,
which may also be used by non-pSeries platforms.
Though pSeries is currently the only platform type that implements
display-character, the code is actually generic enough to be part of
the rtas subsystem.
I hit a bug here because the generic rtas code tried calling ppc_md.progress,
which points to an __init function on most platforms.
We could also clear the ppc_md.progress pointer when freeing the init memory
to make it more explicit that ppc_md.progress must not be called after
bootup.
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
BPA is using rtas for PCI but should not be confused by
pSeries code. This also avoids some #ifdefs. Other
platforms that want to use rtas_pci.c could create
their own platform_pci.c with platform specific fixups.
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
The rtc rtas functions are not pSeries specific but can
also be used by BPA and other SLOF based platforms.
Signed-off-by: Arnd Bergmann <arndb@de.ibm.com>
Signed-off-by: Paul Mackerras <paulus@samba.org>
pSeries and maple have almost the same code for calibrate_decr,
and BPA would need yet another copy. Instead, I'm moving the
code to arch/ppc64/kernel/time.c.
Some of the related declarations were missing from header
files, so I'm moving those as well.
It makes sense to merge this with the pmac function of the
same name, so we end up having just one implementation for
iSeries and one for Open Firmware based machines.
Signed-off-by: Arnd Bergmann <arnd@arndb.de>
Signed-off-by: Paul Mackerras <paulus@samba.org>
Since meminfo.bank[] array contains page-aligned start/size, we
no longer need to explicitly round up/down the addresses when
converting to PFNs.
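A small sketch of why the rounding becomes redundant: for page-aligned
values, rounding up and rounding down give the same PFN as a plain shift.
PAGE_SHIFT of 12, the struct layout and the helper names are assumptions
for illustration.

```c
#include <assert.h>

#define PAGE_SHIFT 12                     /* assumed 4 KiB pages */
#define PAGE_SIZE  (1UL << PAGE_SHIFT)
#define PAGE_MASK  (~(PAGE_SIZE - 1))

struct membank {
	unsigned long start;    /* page-aligned physical start address */
	unsigned long size;     /* page-aligned size in bytes */
};

/* Explicit rounding helpers, as needed for unaligned addresses. */
static unsigned long pfn_down(unsigned long addr)
{
	return addr >> PAGE_SHIFT;
}

static unsigned long pfn_up(unsigned long addr)
{
	return (addr + PAGE_SIZE - 1) >> PAGE_SHIFT;
}

/* With aligned banks a plain shift is enough. */
static unsigned long bank_start_pfn(const struct membank *b)
{
	assert((b->start & ~PAGE_MASK) == 0);
	return b->start >> PAGE_SHIFT;
}

static unsigned long bank_end_pfn(const struct membank *b)
{
	assert((b->size & ~PAGE_MASK) == 0);
	return (b->start + b->size) >> PAGE_SHIFT;
}
```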
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
Adding support for the MPC8548 without PCI support broke the MPC8555 CDS
build by trying to remove a loop variable that is used when PCI is enabled.
Signed-off-by: Kumar Gala <kumar.gala@freescale.com>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Move the signal return code into the vector page instead of placing
it on the user mode stack, which will allow us to avoid flushing
the instruction cache on signals, as well as eventually allowing
non-exec stack.
Signed-off-by: Russell King <rmk+kernel@arm.linux.org.uk>
This change provides support for the DS1374 Real-Time Clock chip present
on the MPC8349ADS board. It depends on a previous patch which adds I2C
support for the DS1374.
Signed-off-by: Randy Vinson <rvinson@mvista.com>
Signed-off-by: Greg Kroah-Hartman <gregkh@suse.de>
Wait for interrupt and clear status pending after resetting the reader.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The kernel takes a very long time to boot if the memory size is bigger than
32767 MB. The memory size is contained in a structure created by an sclp
call. The kernel accesses the field with an LH instruction, which performs a
sign extension of a 16 bit word. In the case of a memory size with bit 2^15
set this results in a very large value and the memory detection just loops for
a long time. In addition, if more than 64 GB are used on a 64 bit system the
memory size is read from an incorrect storage location.
Use zero-extension to read the 16 bit memory size and the correct offset to
read the 4 byte memory size on 64 bit.
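The effect can be reproduced in plain C. This is only a model of the two
kinds of halfword load, with made-up function names; it is not the assembly
touched by the patch.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of a sign-extending halfword load: read a big-endian 16 bit
 * field and widen it as a signed value. The cast to int16_t wraps on
 * the usual two's complement compilers.
 */
static long load_halfword_signed(const uint8_t *p)
{
	return (int16_t)(((uint16_t)p[0] << 8) | p[1]);
}

/* Zero-extending read of the same field. */
static long load_halfword_unsigned(const uint8_t *p)
{
	return (uint16_t)(((uint16_t)p[0] << 8) | p[1]);
}
```

For a field holding 0x8000 (32768 MB), the signed variant yields a negative
number, which becomes a huge value once treated as an unsigned size, while
the unsigned variant yields 32768.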
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Make cmm module parameter "sender" visible in sysfs.
Signed-off-by: Martin Schwidefsky <schwidefsky@de.ibm.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With Al Viro <viro@parcelfarce.linux.theplanet.co.uk>
To make sure switcheroo() can execute when we remap all the executable
image, we used a trick to make it use a local copy of errno... this trick
does not work with NPTL glibc, only with LinuxThreads, so use another
(simpler) one to make it work anyway.
Hopefully this is much improved thanks to merging with Al Viro's version
(which had its share of problems, though, i.e. removing a fix to another
bug and not fixing the problem on i386).
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With Chris Wedgwood <cw@f00f.org>
As suggested by Chris, we can make the "just added" method ->release
conditional on UML only (better: on archs requesting it, i.e. only UML
currently), so that other archs don't get this unneeded crud, and if UML
no longer needs it we can kill this.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
CC: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This occurrence of free_irq_by_irq_and_dev() was missed when converting UML
to the use of hw_controller_type->release.
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
With Chris Wedgwood <cw@f00f.org>
Currently UML must explicitly call the UML-specific
free_irq_by_irq_and_dev() for each free_irq() call it makes.
This is needed because ->shutdown and/or ->disable are only called when the
last "action" for that irq is removed.
Instead, for UML shared IRQs (UML IRQs are very often, if not always,
shared), for each dev_id some setup is done, which must be cleared on the
release of that fd. For instance, for each open console a new instance
(i.e. new dev_id) of the same IRQ is requested().
More precisely, an fd is stored in an array (pollfds), which is afterwards
read by a host thread and passed to poll(). Each event registered by poll()
triggers an interrupt. So, for each free_irq() we must remove the
corresponding host fd from the table, which we do via this ->release() method.
In this patch we add an appropriate hook for this, and remove all uses of
it by pointing the hook to the said procedure; this is safe to do since the
said procedure.
Also some cosmetic improvements are included.
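To make the mechanism concrete, here is a user-space sketch of a
per-dev_id release hook that drops an entry from a pollfds-like table.
All struct, function and constant names here are assumptions for
illustration; only the idea of an optional ->release called on every
free_irq() is taken from the description above.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_FDS 8   /* assumed table size */

/* One entry per (irq, dev_id) registration; fd is the polled host fd. */
struct irq_fd {
	unsigned int irq;
	void *dev_id;
	int fd;
	int in_use;
};

static struct irq_fd pollfds[MAX_FDS];

/* Sketch of a controller type with an optional per-dev_id release hook. */
struct irq_controller {
	void (*release)(unsigned int irq, void *dev_id);
};

/* Hypothetical helper: record a host fd; returns the slot used or -1. */
static int register_irq_fd(unsigned int irq, void *dev_id, int fd)
{
	int i;

	assert(dev_id != NULL);
	for (i = 0; i < MAX_FDS; i++) {
		if (!pollfds[i].in_use) {
			pollfds[i] = (struct irq_fd){ irq, dev_id, fd, 1 };
			return i;
		}
	}
	return -1;
}

/* UML-style release: drop the host fd belonging to this (irq, dev_id). */
static void uml_release(unsigned int irq, void *dev_id)
{
	int i;

	for (i = 0; i < MAX_FDS; i++)
		if (pollfds[i].in_use && pollfds[i].irq == irq &&
		    pollfds[i].dev_id == dev_id)
			pollfds[i].in_use = 0;
}

static const struct irq_controller uml_controller = { .release = uml_release };

/* Runs on every free_irq(), not only when the last action goes away. */
static void free_irq_sketch(const struct irq_controller *ctl,
			    unsigned int irq, void *dev_id)
{
	if (ctl->release)
		ctl->release(irq, dev_id);
	/* generic removal of the action for dev_id would follow here */
}

/* Count host fds still registered for an irq. */
static int count_irq_fds(unsigned int irq)
{
	int i, n = 0;

	for (i = 0; i < MAX_FDS; i++)
		if (pollfds[i].in_use && pollfds[i].irq == irq)
			n++;
	return n;
}
```

A controller that leaves the hook NULL pays nothing beyond the pointer
check, which is why a hook of this shape is cheap for other architectures.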
This is heavily based on some work by Chris Wedgwood, which however didn't
get the patch merged for something I'd call a "misunderstanding" (the need
for this patch wasn't cleanly explained, thus adding the generic hook was
felt as undesirable).
Signed-off-by: Paolo 'Blaisorblade' Giarrusso <blaisorblade@yahoo.it>
CC: Ingo Molnar <mingo@redhat.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patchset is for supporting a new m32r platform, M3A-2170(Mappi-III)
evaluation board. An M32R chip multiprocessor is equipped on the board.
http://www.linux-m32r.org/eng/platform/platform.html
* arch/m32r/Kconfig: Support Mappi-III platform.
* arch/m32r/kernel/Makefile: ditto.
* arch/m32r/kernel/io_mappi3.c: ditto.
* arch/m32r/kernel/setup.c: ditto.
* arch/m32r/kernel/setup_mappi3.c: ditto.
* include/asm-m32r/m32102.h: ditto.
* include/asm-m32r/m32r.h: ditto.
* include/asm-m32r/mappi3/mappi3_pld.h: ditto.
* include/asm-m32r/ide.h: CF support for Mappi-III.
* arch/m32r/kernel/setup_mappi3.c: ditto.
* arch/m32r/mappi3/defconfig.smp: A default config file for Mappi-III.
* arch/m32r/mappi3/dot.gdbinit: A default .gdbinit file for Mappi-III.
* arch/m32r/boot/compressed/m32r_sio.c: Modified for Mappi-III
- At boot time, the m32r-g00ff bootloader turns the MMU off for Mappi-III,
whereas it turns the MMU on for Mappi-II.
* arch/m32r/kernel/io_mappi2.c: Update comments.
* arch/m32r/kernel/setup_mappi2.c: ditto.
Signed-off-by: Mamoru Sakugawa <sakugawa@linux-m32r.org>
Signed-off-by: Hirokazu Takata <takata@linux-m32r.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The SGI IOC4 I/O controller chip drivers are currently all configured by
CONFIG_BLK_DEV_SGIIOC4. This is undesirable as not all IOC4 hardware features
are needed by all systems.
This patch adds two configuration variables, CONFIG_SGI_IOC4 for core IOC4
driver support (see patch 1/3 in this series for further explanation) and
CONFIG_SERIAL_SGI_IOC4 to independently enable serial port support.
Signed-off-by: Brent Casavant <bcasavan@sgi.com>
Acked-by: Pat Gefre <pfg@sgi.com>
Acked-by: Jeremy Higdon <jeremy@sgi.com>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Allow the SMT bit to be set/reset at boot, like the ALTIVEC bit. This
means we will enable SMT on unknown cpus that support it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
We don't use the hardware referenced and changed bits and setting them early
avoids a store to memory. We already do this for userspace hptes but not
kernel ones. Do it.
Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Currently we dynamically allocate the fake parent device for all devices on
the vio bus. This patch statically allocates it. This also allows us to
reuse it for the iSeries "generic" vio device (that is used for passing to
dma routines when communicating with the hypervisor without a device
involved). Also unexport vio_bus_type as it is never used in modules.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch allows iSeries to build with CONFIG_PCI=n. This is useful for
partitions that have only virtual I/O.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch just removes some dead code, fixes messages that referred to the
file this code used to be in and inserts XmPciLpEvent_init into its caller.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch just merges XmPciLpEvent.c into iSeries_irq.c (the only caller of
its only external function). XmPciLpEvent.c just contained the lowlevel
iSeries irq code.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
This patch is just simple cleanups to the iSeries irq code.
- whitespace and comments
- rearrange some functions to avoid forward declarations
- remove XmPciLpEvent.h as its functions were declared elsewhere
- remove declaration of function that no longer exists
No semantic changes.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The AgentId, PhbId, FrameId, CardLocation and Location members of
iSeries_Device_Node are stored early in the boot process just so that a
message about the device can be printed later in the boot process. Remove
them and construct the message by doing the VPD parsing at the time the
message is printed.
Also remove a few unused defines in iSeries_VpdInfo.c.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The IoRetry member of iSeries_Device_Node is really only used locally, so
remove it and replace it with a local variable.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Remove no longer used things from iSeries_pci.h.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Clean up iSeries_VpdInfo.c:
- white space and comment fixes
- make a function static
- the functions here are only called from iSeries_pci.c, so
CONFIG_PCI will be set (so remove check)
- only build when CONFIG_PCI is set
- remove unneeded includes and cast
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
The file arch/ppc64/kernel/iSeries_pci_reset contains only one function that
is not used anywhere (any more). Remove it. This function is the only user of
the ReturnCode member of iSeries_Device_Node, so remove that as well.
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>
Last of this round of the iSeries header cleanups
- don't have two defines for the same thing (HvMaxArchitectedLps
and HvMaxArchitectedVirtualLans)
- HvCallSc.h only needs linux/types.h
- remove unused struct definition
- add "extern" to some more function declarations
Signed-off-by: Stephen Rothwell <sfr@canb.auug.org.au>
Signed-off-by: Andrew Morton <akpm@osdl.org>
Signed-off-by: Linus Torvalds <torvalds@osdl.org>