About site: Hardware/Components/Processors/Multiprocessors - PiSMA: A Parallel VSM Architecture
Return to Computers also Computers
  About site: http://www.acm.org/crossroads/xrds5-3/pisma.html

Title: Hardware/Components/Processors/Multiprocessors - PiSMA: A Parallel VSM Architecture Parallel vIrtually Shared Memory Architecture; merges benefits of shared and distributed memory models, secures confidential communication via authentication. [ACM Crossroads]
be_rec_photography Charter and FAQ for the Belgian Usenet newsgroup about photography be.rec.photography. Includes photography links.

InfraHost Internet Managed services and dedicated hosting offering for the small and medium-sized market.

LTF-Cimulator LTF-C neural networks simulator for solving classification problems.

RFC_1637 DNS NSAP Resource Records. B. Manning, R. Colella. June 1994.

RFC_1291 Mid-Level Networks Potential Technical Services. V. Aggarwal. December 1991.

Utilitran Powerful java-based software engine for B2B trading partner networks.


  Alexa statistic for http://www.acm.org/crossroads/xrds5-3/pisma.html





Get your Google PageRank






Please visit: http://www.acm.org/crossroads/xrds5-3/pisma.html


  Related sites for http://www.acm.org/crossroads/xrds5-3/pisma.html
    The_Official_Bee-Line_Homepage A European illustration system made by illustrators, who have used a happy, positive and humorous view of life as the common basis for their drawings. Several products are available.
    Screenblast Provides broadband content that includes audio and video clips with creation tools by Sonic Foundry: Acid, Sound Forge, Siren Jukebox and Video Factory.
    Computer_History__IBM_360/370/3090/390 A personal history of the IBM mainframe by Lars Poulsen.
    Bruce_Lindbloom Color calculators and spreadsheets, RGB reference images, information on Gamut mapping of colors and RGB working space for digital imaging and color science. A set of eleven Lab TIFF images. Each imag
    Softwinter Offers tools for transparent encryption on Pocket PC and Windows based computers.
    Html_Email_Marketing Provides Internet online direct bulk email marketing software, opt-in email list builder and newsletter publishing.
    H-log_com General directory, logging sites in an SEO friendly way. Also offers free hosting and other resources.
    ThingLabII__Constraint-based_user_interfaces Supports exploration of constraint-based user interfaces; consists of a set of classes that define constraints and constrainable objects called things. Incremental constraint satisfier, module compile
    Advanced_Network_&_Services,_Inc_ Working to advance education by accelerating the use of computer network applications and technology.
    Matriarch_Systems Developers of PC and Network based telephone voice recording software and systems.
    Smart_Score Converts printed sheet music into multi-track MIDI files. Handles virtually any type of score, 90-98% accuracy. TWAIN compatible.
    Great_Deals_Software Software and games for both Windows and Macintosh computers, at discount prices.
    C#_Strikes_a_Chord Compares 4 languages: C, C++, C#, Java. Shows family genealogy. [Dr. Dobb's Journal] (September 4, 2000)
    Official_Fidonet_web-site The main official web-site of FidoNet with a lot of information.
    soc_couples_wedding A newsgroup for discussing all aspects of wedding planning. The FAQ addresses these issues and also proper etiquette in this newsgroup.
    RFC_3135 Performance Enhancing Proxies Intended to Mitigate Link-Related Degradations. J. Border, M. Kojo, J. Griner, G. Montenegro, Z. Shelby. June 2001.
    GridRepublic A non-profit organization created to provide supercomputing resources to public interest research, by use of volunteer distributed computing.
    Vsoft_Services Develops solutions for small business including payroll management, retailer, estimating, accounts and inventory software.
    HTML_Online_Reference Offering online HTML reference. Includes tips and tricks. [Atom]
    Template_Toolkit A extensible Open Source tool written in Perl for building dynamic web content.
This is websites2007.org cache of m/ as retrieved on 2008.09.05 websites2007.org's cache is the snapshot that we took of the page as we crawled the web. The page may have changed since that time.
PiSMA: A Parallel VSM Architecture

PiSMA: A Parallel VSM Architecture

by Dimitris Lioupis, AndreasPipis, Maria Smirli, and MichaelStefanidakis

Introduction

There is an increased interest in large scale multiprocessorscomposed of a large number of processing elements. The two main modelsof multiprocessor architectures, the shared memory model [10] and the distributed memory model [1, 2, 3,4, 11], suffer from seriouslimitations because of their nature. Distributed memory architecturespromise better scalability and increased performance but are moredifficult to program efficiently. Shared memory architectures, on theother hand, have a better programming model. Their disadvantage,however, is the limited bandwidth of the shared bus, which restrictsscalability. This paper presents a parallel computer architecture, called PiSMA(Parallel vIrtually Shared Memory Architecture), introduced in [5, 6]. PiSMA architecturecombines the benefits of both multiprocessor models mentioned above,resulting in an efficient parallel architecture. The simplicity of theunderlying hardware permits the scaling of PiSMA, utilizing newgenerations of microprocessors, without the need to redesign the wholesystem. PiSMA can be used with or without a global communicationnetwork depending on the type of applications it is used for, asdescribed in [9]. In the following sections adetailed description of the PiSMA architecture, its programming model,the Virtual Memory management and message passingsupport mechanisms are presented.

Description of the PiSMA Architecture

The PiSMA architecture forms an expandable toroidal grid withalternating processors and memories: each processor is connected tofour memories and each memory to four processors as shown in Figure 1. Figure 1. PiSMAArchitectureThis structure enables each processor to communicate directly(through a common memory) with up to eight neighboring processors asshown in Figure 2.Figure 2. Aprocessor and its eight neighborsCommunication with any other processor beyond these eight processors onthe grid is performed by message passing which is transparentto the user. The four ports to memory are implemented as four cachesconnected on the same bus with the memory module as depicted in Figure 3. Figure 3. ThePiSMA basic building block: a 4 processor/<b>memory</b> tileThe processor-memory cluster consists of two basic boards, namelythe processor board and the memory board. The processor board has twoversions implementing the desired connectivity and allowing for easyexpansion. The memory board contains four memory modules. Byconnecting the top and the bottom connector and the leftand the right connector, the smallest configuration possible (fourprocessors and four memories) is formed. The physical implementationof the architecture is described in detail in [5]. PiSMA is easily scalable, and biggerconfigurations can be easily constructed by repeatedly connectingtogether the basic building blocks shown in Figure3. A very important aspect of the basic PiSMA architecture is that nonetwork for intra-processor communication exists. Memory modules areused as a communication media instead, with messages being transferredto their destinations via consecutive memory-to-memory copying. Thiscopy operation is performed by processors common to each pair ofmemories. It was shown in [9] that this approachworks well for a wide range of parallel applications and that theaddition of a global communication network does not improveperformance significantly. Moreover, the absence of a network has theadvantage of easier scalability. In the case of applications withheavy global communication, however, the presence of anintra-processor communication network does improve the system'sperformance. In order to satisfy application demands, a networkcan be easily incorporated into the basic PiSMA hardware, as shown in[7], where PiSMA was used to execute specialpurpose applications such as Video on Demand.Fault tolerance is another important feature of PiSMA architecture; this capability comes from its structure shown in Figure 3, that provides alternative paths that can be used when a component(processor or memory) fails, as described in [9].

Programming model

Programs are executed on PiSMA by mapping the execution tree of aparallel program onto the grid of processors. The code is loaded in amemory module and the root of the tree (which is the first call to themain procedure) is randomly assigned to one of the adjacent processors. This processor starts executing the code and spreads work outdynamically to its neighbors, through its four adjacent memories. Thework distribution is performed by the dynamic load balancer,described in detail in [8]. Distributing workconsists of copying a work granule into a processor's adjacent memoryand inserting an entry into its task queue. A work granule is a self-contained piece of work (typically 50-500 lines of code). Each work granule consists of: a header, indicating the resources required for execution of this particular work granule the body, which contains the code to be executed. The dynamic load balancing algorithm is a key element in ourarchitecture. This algorithm attempts to evenly distribute the workgranules on the processing surface, avoiding, as much as possible,data transfers and creation of long communication paths in shared dataaccess. In order to achieve these conflicting goals, the loadbalancing algorithm draws on information included in the header ofeach work granule (such as code size and shared variable dependencies)to decide which neighboring processor is best suited to be assigned apiece of work. This decision is also based on data dependencies andeach processor's load.

Virtual Memory Management

The virtual memory management mechanism of the PiSMA architecturesupports the mixed addressing mode (direct access and messagepassing) of the architecture. The selection of the appropriateaddressing mode is completely dynamic and transparent to the applicationprogrammer. The dynamic nature of this mechanism imposes the fact thatresources' physical locations on PiSMA are unknown until runtime.The PiSMA programming model distinguishes between shared andprivate data. Private data are expected to reside in one of the fourmemories adjacent to the executing processor and are referencedrelatively through an offset. Referencing private variables isstraightforward; the processor presents the real address on its busand reads from the appropriate memory. There is no need for specialhardware to perform this transaction; the processor sees its fourneighboring memories as a uniform and contiguous memory space. Anycompiler that references data locally (as the majority of compilersdo) into the data segment can be used to handle this kind of data.Shared data may or may not be in an adjacent memory and arealways referenced through indirection pointers. When theapplication program executes, shared data may be directly accessible(if they reside in an adjacent memory to the executing processor memory), orthey may be remotely accessed (via message passing). Therefore it isnot possible to directly reference this kind of data in an absolutemanner, since its location in the system before runtime isunknown. The application compiler generates the appropriate code to supportthis accessing scheme. For each shared resource in the applicationprogram the respective indirection pointers are allocated. After that,shared resources are referenced through these pointers.The contents of indirection pointers are undefined at compile time. At runtime (specifically, during the mapping of thework onto the PiSMA grid), the location of shared variables isestablished when they are used for the first time and the associatedindirection pointers are initialized. When a shared variable is directly accessible by a processor, thecorresponding indirection pointer contains its physical address inone of the four adjacent memories. The executing processor simplyexecutes an indirect access to its physically connected memoryspace. In this way there is no special hardware or softwareintervention; shared data can be accessed as local data.This is not the case when the shared data resources are remote tothe executing processor. The value contained in the indirectionpointer in this case is a special memory address, designated togenerate an OS trap. The indirection pointer contains additionalinformation about the shared data location in the system, such as thephysical memory module, the address and usage status of thisparticular shared resource. The generated OS trap service routinetranslates the request into a remote-data requesting message,utilizing the resource's location information from the indirectionpointer.It should be clear that the PiSMA architecture does not utilize apure global virtual memory address space scheme. That is, theaddresses generated describe uniquely a physical location in thewhole virtual memory space and a translation hardware mechanismexists to map these addresses into physical space. Instead, PiSMA usesphysical addresses when accessing data directly, and relies on asoftware translation mechanism to generate messages for remote datarequests.PiSMA does require that its data have a fixedphysical location in the system's memory when it is used, as it is anon-Cache Only Memory architecture [11]. The fixed location of data resources in useavoids data coherency problems, and requires no global datainvalidation messages.

Message Passing Support

Messages generated from the operating system are forwarded totheir destination through system memory. Each processor has amessage queue defined in each of its four adjacent memories. In thesequeues the eight neighboring processors store the messages, whoserecipient is this particular processor. During the processor's normaloperation, an interrupt driven software message handler queries thequeues and services the pending messages. Messages are forwarded to their destination by packetswitching from processor to processor. Each message is equippedwith a header describing its type, its destination processor'slocation on the grid, and its originator (if this is necessary). Themessage route through the processor grid is not pre-established whenthe message is generated. Each processor, uponreceiving a message, decides to forward it to one of its eightneighboring processors, or to consume it itself. The forwardingdirection is found by comparing the destination processor's location,from the message header, with the forwarding processor's own location(it is assumed that each processor has the knowledge of itsown location).Message passing always imposes high latency to the application'sexecution. Messages kept in main memory are uncachable because oftheir nature, and message reading or writing operations take manyexecution cycles. The time spent for message handling on eachprocessor is another source of overhead, delaying the applicationexecution. The PiSMA architecture tries to minimize the message passinglatency by avoiding it or hiding it. In the first case, theadvantage of the physical communication through common adjacentmemories is used, avoiding the generation of messages. Processorsworking with shared resources on a common memory can execute veryquickly, without message intervention. It is the responsibility of thediffusion algorithm to assign work granules with common resources toneighboring processors, but efficient distribution of processes isnot always possible. Sometimes the diffusion algorithm hasinsufficient information to perform an optimal job distribution, andat other times the parallel application does not allow the distributionof its data set in different system memories. The result is anincrease in message traffic (which could be excessive). In order to minimize theimpact of such a situation, the operating system of PiSMA is designedto swap out processes waiting for remote data. In this way theprocessor can execute something useful, thus hiding wait time.

Conclusion

PiSMA is a Virtually Shared Memory Architecture that combines boththe benefits of the shared and distributed models. A detaileddescription of the PiSMA Architecture, its Virtual Memory Managementand Message Passing support was presented and it was shown thatparallel programming in PiSMA is an easy task, without the need forknowledge of the underlying hardware. All these details aretransparent to the programmer since the operating system kernel ofPiSMA takes care of them. The encapsulation of intelligent dynamicalgorithms in the operating system kernel, such as the diffusionalgorithm and the dynamic memory management, makes PiSMA a powerfulparallel distributed memory computer with an easy programminginterface compared to that of the pure shared memoryarchitectures.

References

1 A. Agarwal, R. Bianchini, D. Chaiken, K. Johnson, D. Kranz, J.Kubiatowicz,B. Lim, K. Mackenzie, D. Yeung, The MIT Alewife Machine: ArchitectureandPerformance. In Proceedings of the 22nd Int'l Symp. on ComputerArchitectures,Los Alamitos, California 1995 2S. Borkar et al. iWARP: An Integrated Solution to High SpeedParallelComputing. In Proceedings of Supercomputing'88, Nov. 1988.3J.T. Kuehn, B.J. Smith, The HORIZON Supercomputing System:Architectureand Software. In Proceedings of Supercomputing'88, Nov. 1988.4D. Lenoski, J. Laudon, K. Gharachorloo, W.-D. Weber, A. Gupta, J.Hennessy,M. Horowit, and M.S. Lam. The Stanford DASH multiprocessor. IEEEComputerJ., 25(3), pp 63-79, March 1992.5D. Lioupis, N. Kanellopoulos, CHESS Multiprocessor: AProcessor-MemoryGrid for Parallel Programming. In Cache and Interconnect Architecturesin Multiprocessors, edited by M. Dubois & T. Thakard, KluwerAcademicPublishers, pp245-257, June 1989. 6D. Lioupis, N. Kanellopoulos, M. Stefanidakis, The Memory Hierarchyof the CHESS Computer. Microprocessing and Microprogramming (JSA)(vol 38), pp 99-107, Sept 1993. 7D. Lioupis, A. Pipis, M. Smirli, N. Kanellopoulos, Architecture ofa VSM Parallel Video-on-demand server. Appeared in the specialsessionon Communication and Computing for Distributed Multimedia Systems of the10th International Conference on Parallel and Distributed Computing andSystems, Las Vegas, Nevada, October 28-31, 1998 8D. Lioupis, M. Stefanidakis, Dynamic Load Balancing on aVirtually-SharedMemory Parallel Computer System. In Proceedings of the 6thInternationalPARLE Conference, pp813-819, Athens, July 1994. 9D. Lioupis, A. Pipis, M. Stefanidakis, PiSMA: An Upgradeable FaultTolerant Approach to Parallel Processing. In Proceedings of the 4thIEEE International Conference on High Performance Computing,pp277-283,Bangalore India, Dec 1997.10C. Thaker, L. Stewart, Firefly: A Multiprocessor Workstation. InProceedingsof the 2nd International Conference on Architectural Support forProgrammingLanguages and Operating Systems, ACM, October 198711D. H. D. Warren and S. Haridi, The Data Diffusion Machine - AScalableShared Virtual Memory Multiprocessor, In Proceedings of the 1988 Int.Conf. on Fifth Generation Computer Systems, pp 943-952, Tokyo, Japan, Dec. 1988 Dimitris Lioupis (lioupis@cti.gr)is an Assistant Professorat the Department of Computer Engineering and Informatics, Universityof Patras, Greece. Andreas Pipis (pipis@cti.gr), Maria Smirli (smirli@cti.gr) and Michael Stefanidakis (mistral@cti.gr) are PhD students at the Department of Computer Engineering and Informatics,University of Patras, Greece.Want more Crossroads articles about Computer Architecture?Go to the
 

Parallel

vIrtually

Shared

Memory

Architecture;

merges

benefits

of

shared

and

distributed

memory

models,

secures

confidential

communication

via

authentication.

[ACM

Crossroads]

http://www.acm.org/crossroads/xrds5-3/pisma.html

PiSMA: A Parallel VSM Architecture 2008 September

dvd rental

dvd


Parallel vIrtually Shared Memory Architecture; merges benefits of shared and distributed memory models, secures confidential communication via authentication. [ACM Crossroads]

Rules




© 2008 Internet Explorer 5+ or Netscape 6+

Recommended Sites: 1. Arts - Business - Computers - Games - Health - Home - Kids and Teens - News - Recreation - Reference - Regional - Science - Shopping - Society - Sports - World Miss Gallery - Top Anime Hentai - DVD rental by mail - Teen Chat - Pay Day Loans - Finance - Debt Consolidation - Debt Management
2008-09-05 13:12:40

Copyright 2005, 2006 by Webmaster
Websites is cool :) 94Budownictwo, Firmy - Skip Hire - Hotel Bruksela - Głośniki Car Audio Kable - Polish Pottery