Legion; commercial pedestrian simulation

Chapter 4.

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis

Legion is the company behind the commercial pedestrian simulation software, Legion Studio and its accompanying 3D visualisation software, Legion 3D. Both are used worldwide to optimise the design and operation of public spaces. Such spaces typically include: transport terminals; sport, entertainment and leisure venues; shopping centres; commercial and public buildings; and venues for major international events like the Olympics.

Their global portfolio includes key organisations in the fields of transport, major events, sports, urban development and government. Legion software is used by many of the leading rail and transit agencies and has been deployed for each Olympic Games from Sydney 2000 right up to London 2012. Legion simulations are also used in many urban developments around the world. Designers, planners, engineers and asset managers have used Legion software and services to evaluate and optimise public spaces in improving safety, efficiency and profitability.

Their customers benefit greatly from the fully validated analyses and visualisations that the software produces. These outputs are used to attain considerable economic benefits for facilities and programmes. Additionally, Legion software and services can improve the efficiency of projects; streamline the decision making process; ensure security; improve risk management and enhance profitability. And all this is achieved with the optimal pedestrian experience in mind.

Legion’s patented simulation technology is the result of many years’ inter-disciplinary research into pedestrian behaviour. The accuracy of the simulations has been independently tested against real-world data resulting in endorsements by, among others: Crossrail, London Fire Brigade, London Underground and Santiago Metro.

Legion Limited is a UK company that designs, develops and markets a market-leading pedestrian simulation software. The company has a keen interest in advancing its science and technology to maintain its competitive edge. Industry trends suggest a continued move towards multiple CPU personal computers. The development of a multi-threaded version of the Legion simulation software is the only way to harness the power of commodity hardware. In addition, distributed computing is an indispensable tool for tackling simulations of ever increasing size and complexity. This research aims to produce state-of-the-art and commercially desirable output.

This chapter describes the development of the multi-threaded and distributed version of the software and presents benchmark results demonstrating how the use of a multicomputer or of even a multi-core computer can greatly accelerate the speed of a pedestrian movement software.

4.1 Introduction

The Legion Studio software suite[1] is a widely adopted, powerful and accurate pedestrian simulation software. It comprises of three applications: the Model Builder, the Simulator and the Analyser. In combination, these applications enable you to simulate pedestrian movement within a defined space, such as a railway station, sports stadium, sports park, airport, tall building, piazza, transport hub, town centre or any place that people assemble in or move through.

The software simulates the behaviour and movement of pedestrians footstep-by-footstep[1] calculating how individuals interact with each other and with the physical obstacles in their environment. The simulations employ a microscopic simulation model[2], which treats space as continuum, using spacial objects, such as entrances, exits and escalators, to define space utilisation. The simulation navigates entities on the ‘least-effort’ principle. Each entity chooses its next step in an effort to find the best compromise between directness of path, speed and comfort.

The Model Builder can be used to create an accurate model of the space that we want to simulate. In the Model Builder we can perform the following:

• Import architectural drawings (CAD) that define the physical space.

• Specify the pedestrian demand imposed on the space.

• Designate areas where activities such as queuing or waiting occur.

• Account for different routes.

• Link operational data to the model.

• Export model files for use in the Simulator.

The Simulator can be used to run a simulation of how pedestrians move or circulate within the space defined in the Model Builder. In the Simulator we can perform the following:

• Import model files.

• Playback and view the simulation.

• Record appropriate parts of the simulation as a ‘results file’ (.res) to be analysed.

• Record all or appropriate parts of the simulation as a video file for presentations.

The Analyser can be used to run a series of analyses on our simulated space. In the Analyser we can perform the following:

• Import results files and model files.

• Play back selected parts of a recorded simulation, or run a new simulation (like in the Legion Simulator).

• Visualise key metrics in the form of maps.

• Run detailed analyses and display the results as time series, stacked bars or histograms.

• Export the analysis session as graphs, results files, video, pictures or tables for inclusion in presentations, reports and spreadsheets.

Using Legion Studio, we can perform simulations on the design or operation of a space and assess the impact of different physical designs or levels of pedestrian demand. We can study the impact of chance events, such as the closure of an exit or late arrival of a train and test different evacuation scenarios for speed and safety. The latter can prove vital for compliance with increasingly rigid safety regulations. Legion simulation solutions are well suited for various stages of projects:

• Capital Planning

During the strategic planning or capital planning process is where, economically, the software, data and analysis can have the biggest impact by evaluating early in the process where you need to spend money and where you don’t, enabling you to maximise cost savings at the earliest stage.

• Design Phase

During the design phase for a facility design or refurbishment, you can minimise design iterations or alternatives by analysing and comparing potential designs before too much time has been spent on flushing out design options. This can help shorten the overall design phase by efficiently removing options with data and analysis. Additionally by evaluating a design, you can optimise the design now and avoid costly design changes downstream during the build out.

• Construction Phase

Construction in transit, aviation, stadiums or rail stations as part of an upgrade to the infrastructure is a common occurrence. The agency wishes to maximise the available space for construction and material staging while remaining open to the public with minimal service interruptions. Maximising the speed of construction while accommodating the pedestrian demand is a difficult balancing act. By modelling the proposed construction phasing plan the guess work is taken out of the process. Decisions regarding how much and where to close can be made with facts on what the outcome will be of the different construction staging and operations plans.

• Daily Operations & Operations Planning

Streamline daily operations by identifying more efficient designs or layouts which can drive better pedestrian flow without the need for added personnel or temporary barriers. Compare and analyse various operational procedures and traffic demands to help a venue reach and maintain optimum operational efficiencies. In the Sports Arena & Special Events situations, simulations can help to identify improvements to pedestrian flow without disrupting existing operations. Train Operations: In the Train Sector, Legion Software can be used to manage various aspects of Train Operations which includes train car selection and fit out as well as assessment of timetable efficiency and performance optimisation. At any stage of operations you can use Legion Simulation to assess and optimise your train schedules and train car capacities.

• Safety and Security assessment

Every rail & metro station, football stadium and airport requires an annual safety certificate. Commercial buildings need to test evacuation scenarios. Every major event needs to establish evacuation and contingency plans. Design, simulate and stress test safety measures in an efficient and timely manner. Simulate alternative evacuation scenarios where the key variables are modified so that you can see all results and eliminate the guess work. Safety and security plans can be designed based on clear assessment of risk, calculated predictions thus removing a lot of the guess work and lowering the overall risk associated with security or safety issues.

4.2 Legion Analyser

The Legion Analyser enables us to set up and run a series of rich, user-defined, analyses on our simulation using two methods:

• On-line analysis – analysing while simulating (using an .ora file).

• Off-line analysis – analysing a recorded simulation (using a .res file).

Both methods give access to a wide range of metrics, such as density, speed, flow, journey time and dissatisfaction, and a rich array of display methods and outputs including maps, graphs, tables and raw data. In the Legion Analyser a user can import data and model files, playback all or selected parts of the simulation, track individual entities and visualise their walking paths over time, visualise key metrics in the form of colour-coded maps, analyse any area of the model and display the results as time series, stacked bars or histograms and finally, produce results files, video, pictures or data for presentations, reports and spreadsheets.

Read also  The History Of Microsoft Powerpoint And Word Computer Science Essay

The Legion Analyser creates an analysis (.ana) file as a template for storing the settings of all maps, graphs and analyses generated from an .ora file or the simulation’s .res file. In this way, we can analyse many files using the same analysis template, which is a good way to compare different scenarios.

The Legion Analyser enables us to take the whole model, or a defined portion of it, and ask certain questions. The following schema details the kind of objectives that typically relate to the four main concerns that Legion analyses relate to:

• Feasibility studies.

• Design and construction.

• Renovation.

• Operations.

The following is a sample of the types of questions we can ask and get an answer using Legion analyses:

• Will the venue cope with projected demand?

• What are the density levels at bottleneck points such as the bottom of stairs, main entrances or stadium vomitories?

• What is the average waiting time at facilities during peak periods?

• Can the venue be evacuated safely in the case of an incident?

• What is the interchange time distribution between lines A, B and C?

4.2.1 Maps and Value Ranges

Legion Analyser maps provide colour-coded representations of the simulation we are analysing, enabling us to visualise key Entity experience and crowd dynamic metrics such as density and space utilisation. They are really good for obtaining an overview of a scheme’s performance and we can apply them to the whole of model or restrict them to specific areas defined by Analysis Zones.

The colours displayed in a Legion map are linked to two types of range:

• Value ranges – essentially these are Levels of Service (such as those defined by J. Fruin[REF] or the US Highway Capacity Manual[REF]) used to rate experience-metrics.

• Colour ranges – an ordered list of colours used to describe local conditions that typically range from “excellent” (blue) to “bad” (red).

Colours within a map can represent the following:

• Occupancy – the number of Entities inside an area.

• Anything that can be used to measure Entity experience – examples include speed achieved, density experienced and total distance covered by Entities inside an area.

• Time – the duration inside an area for which a pre-set condition on occupancy or on any Entity experience metric has been met.

The Legion Analyser provides several default maps but we can also create our own using default or custom value and colour ranges.

4.2.2 Standard Maps

The following standard maps are available within the Analyser, on the Maps menu:

• Cumulative High Density

• Cumulative Max Density

• Cumulative Mean Density

• Cumulative Min Density

• Evacuation

• Space Utilisation

Descriptions of each map and their typical uses follow.

Cumulative High Density Map

This map shows how long various areas of a site have registered densities greater than a specified limit. The range of colours represent time. The map is similar to a “temperature” map: areas that have experienced high levels of density for a long time appear red, those that have experienced shorter periods of density appear blue.

This map is best used for identifying “hot-spots” within a site: areas where high levels of density are sustained. It asks the questions “is this design creating persistently uncomfortable crowd densities?” and “should it be altered to alleviate these problems?”

Cumulative Max/Mean/Min Density Map

These maps display the maximum, mean and minimum levels of density registered in an area from the beginning of playback to t he current moment. They are generally used in combination with value ranges corresponding to widely used Levels of Service (Fruin’s, USHCM, etc.).

They are best used for measuring the performance of a site against predetermined standards or imperatives such as “the average density within a unit of space must not exceed Fruin’s Level of Service x”.

Evacuation Map

Evacuation Maps represent the amount of time that has elapsed from the beginning of playback to the most recent moment when an area was occupied. They are useful for safety assessments such as a train on fire or a station on fire, and egress assessments such as time to clear a stadium, as illustrated in Figure 5, or office building. They can also be used for platform capacity assessments, to show how quickly platforms clear following the arrival of a train.

Space Utilisation Map

The Space Utilisation Map reveals how much space within a site is being used. It records the location of every step of each Entity over the duration of the simulation. Heavily used areas are coloured red and lightly used areas are coloured blue. Areas of the simulation that are not used at all remain white.

The colour range represents the amount of time a unit of space has been occupied within the simulation. The default setting of this unit of space is 10x10cm. This map is best used for illustrating which areas of a site are used the most and the least. It can support questions such as “if this area is not being used regularly, could it be used for a small kiosk or retail unit?”.

4.3 Multi-Threaded Legion Analyser

The following sections describe in depth the design, the implementation and the benchmark results of the Multi-Threaded version of the Legion Analyser commercial software.

4.3.1 Design

In this section we describe the requirement that shaped the design of the multi-threaded version of the Legion Analyser.

Objectives

The main objective for re-developing the Legion Analyser is to provide a faster, maintainable release. Industry trends suggest a continued move towards multiple CPU personal computers. The development of a multi-threaded version of the Legion simulation analysis software is the only way to harness the power of commodity hardware.

The main beneficiaries of this activity were the users who have come to rely on the functionality that the Legion Analyser provides. The increased performance was a benefit to them and to new users. In addition, one of the major considerations when redesigning the Legion Analyser was to make maintenance and support easier for the developers of Legion.

Architecture

The most important components of the Legion Analyser are shown in Figure 6. The class CReSpaceMapManager is responsible for the list of the enabled maps, for their metrics and for their implementation. The CCellStorageManager class is responsible for the accumulation and for the identification of the data of the cells. The storage is a grid full of CCellStorageData class pointers. The CCellStorageData contains a vector of CCellStorageDataItem, one item for each map.

The major components of the Cell Accumulation and Identification classes are being illustrated in Figure 7. These classes are responsible for resolving the list of affected cells, stepped by the entities, computing those affected cells and then accumulating them.

The Statistics and the Entity Map Manager classes are being illustrated in Figure 8. The Statistics Manager is responsible for the statistics of the Legion Analyser, keeping a track of the running time of the enabled analysis and of the statistical metrics. The Entity Map Manager is responsible for the handling and for the modifying of the entity maps.

4.3.2 Implementation

An analysis session comprises of the following tasks:

• Advances the simulation time clock.

• Loads entity list from a ROOT[3] file.

• Calculates the maps by traversing a grid-like structure gathering information from nearby entities[2].

• Renders the maps and the entity movement.

• Computes analyses by traversing a list of analyses.

• Updates the graphs and saves any files that need saving.

The maps are the collection of objects that take care of accumulating various metrics from the entities as they move across the usable space. They are responsible for:

• Internal abstract representation[3].

• Internal memory structure.

• Algorithms needed to identify the space that is stepped on.

• Algorithms needed to accumulate entities metric as they move.

The Multi-threaded Analyser creates a thread pool with a size equal to the total number of the enabled maps for calculation. The use of a thread pool is proved to be faster than native threads since there is no thread creation and destruction overhead[4]. There is no essential dependency or communication between the parallel tasks since a communication overhead reduces the speed up achievable by the programme. There are no invalid pointers since iterators are invalidated during insertions and removals. The use of Critical Sections to lock the critical region of the OpenGL drawing procedure of the maps was faster than the use of a simple mutex or of a recursive mutex. Listing 1 contains the pseudo-code of the process.

Read also  Mathematical and Physics Concepts in Computer Games

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis 1

1. Create a thread pool according to the number of the enabled maps

2. For each simulation time step

a. Get the entity list

b. Traverse the entity list from the beginning to the end or vice versa

c. Lock the openGL drawing procedure

d. Wait for the other thread(s) to finish calculating the time step

e. Remove the lock and draw the maps on the screen

f. Advance to the next simulation time step

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis 1

Listing 1. The pseudo-code of the multi-threaded Analyser.

The sequence of the actions performed in an off-line Legion analysis can be seen in Figure 9 and the sequence of the actions performed in an on-line Legion analysis can be seen in Figure 10.

The only difference between the on-line and the off-line analysis is that during the on-line analysis, the Analyser communicates with the Simulator using the Simulation Wrapper class.

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis 1

The detection of the number of the processors or of the cores in a machine is being illustrated in Listing 2.

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis 1

// CLASS CReSpaceMapManager

CReSpaceMapManager::CReSpaceMapManager()

: m_threadPool( )

{

//

// Detect the number of processor in the machine, and set it as the

// default value for the processor property

//

SYSTEM_INFO systemInfo;

::GetSystemInfo( &systemInfo );

// NOTE: the defaut pool is fifo

m_threadPool.size_controller().resize( systemInfo.dwNumberOfProcessors );

}

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis 1

Listing 2. The detection of the total number of processors or of the cores in a machine.

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis 1

The execution of a thread for each enabled map is being illustrated in Listing 3.

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis 1

// Function DoCheckWin

void CReSpaceMapManager::DoCheckWin( void )

{

// Get the entities from the entity manager

Legion::Simulator::IEntityPtrVector& entities =

CLegnResEntityDataManagerBase::GetInstance()->GetCurrentEntityList();

MapList::iterator iter( m_mapList.begin() );

MapList::iterator end( m_mapList.end() );

while( iter != end )

{

const COdbSpaceCentricMap* pSpaceMap = dynamic_cast<const COdbSpaceCentricMap*>( (*iter)->GetMap() );

// Only do calculations for enabled maps

if( pSpaceMap->IsEnabled() )

{

CReSpaceMapManagerItem* pSpaceMapItem = dynamic_cast<CReSpaceMapManagerItem*>(*iter);

ASSERT( pSpaceMapItem );

// Check for the reset interval

int nResetInterval = pSpaceMap->GetResetInterval();

if (nResetInterval != COdbSpaceCentricMap::MapResetDisabled)

{

double timeStamp = ClegnResEntityDataManagerBase::GetInstance()- >GetStopWatch().GetTime().GetTimeSecond();

double rIntervals = double(int(timeStamp / double(nResetInterval)));

// stopwatch keeps time-step interval in milliseconds

double timeTolerance = ClegnResEntityDataManagerBase::GetInstance()- >GetStopWatch().GetTimeStepInterval() / 1000.0;

if( timeStamp – rIntervals*nResetInterval < timeTolerance )

{

ResetMap( pSpaceMap );

}

}

// Execute a thread

m_threadPool.schedule( SpaceMapTask( pSpaceMapItem, entities ) );

}

++iter; // increase the iterator of the map list

}

// Join the thread pool to wait for all the maps to finish the computation

if( !m_threadPool.empty() )

{

m_threadPool.wait();

}

} // End of DoCheckWin function

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis 1

Listing 3. The detection of the total number of processors or of the cores in a machine.

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis 1

4.3.3 Performance

The memory footprint of the programme has been reduced to the minimum with the use of associative vectors instead of using maps of vectors. The associative vector is a std::map look-alike that uses a sorted vector for storage and such a choice has the advantage of fast binary searches but slow insertions and removals. Iterators are invalidated during insertions and removals, which doesn’t happen with std::map’s node based storage. The Associative Vector is faster than std::set/map in lookups and more memory friendly, specially for small types, since normally a tree like structure imposes an overhead of 3 pointers and an integer per node. That without counting that memory allocation for a vector has far less fragmentation when using std::allocator.

The memory management has been optimised by changing and tweaking the structure of the programme. As a result, a lot of unnecessary search procedures at every simulation time step have been removed. The programme uses the same amount of memory as the original single-threaded version in most of the models and in case that the programme uses more memory, the increase is only between 3% to 6%. To benchmark our multi-threaded implementation, we have used six models with different levels of complexity and size. The increase in the performance depends on the size and complexity of the model. All our used models are available in Appendix XXX.

In Table 1, a 55.43% increase in performance and a 3.16% increase in memory usage is being illustrated using a small-sized model with 350 entities.

TABLE 1. Small-sized model. Name: PM Peak. 350 Entities. Simulation time: 3 Hours.

Metrics

Original

Multi-threaded

Total Time HH:MM:SS

00:39:45

00:17:43

Memory Usage in MB

190

196

Peak CPU Usage

50.00%

75.00%

In Table 2 the increase was 34.47% and with a 3.51% increase in memory usage using our second small-sized model with 552 entities.

TABLE 2. Small-sized model. Name: UP Demo v3:1. 552 Entities. Simulation time: 1 Hour.

Metrics

Original

Multi-threaded

Total Time HH:MM:SS

00:07:50

00:05:08

Memory Usage in MB

114

118

Peak CPU Usage

50.00%

75.00%

In Table 3, an increase of 57.77% in the performance and a small decrease of 0.82% in memory usage is being presented using a medium-sized model with 1200 entities.

TABLE 3. Medium-sized model. Name: Gatwick Airport Station Redevelopment. 1200 entities. Sim time: 1 Hour.

Metrics

Original

Multi-threaded

Total Time HH:MM:SS

00:22:32

00:09:31

Memory Usage in MB

245

243

Peak CPU Usage

50.00%

88.00%

Likewise, In Table 4, an increase of 65.50% in performance and a 5.93% increase in memory usage is being illustrated using a medium-sized model with 2500 entities.

TABLE 4. Medium-sized model. Name: New WTC Model. 2500 entities. Simulation time: 1 Hour and 30 Mins

Metrics

Original

Multi-threaded

Total Time HH:MM:SS

01:41:22

00:34:58

Memory Usage in MB

489

518

Peak CPU Usage

50.00%

85.00%

In Table 5, an increase of 34.15% in performance can be seen in Table 3 together with a 6.38% decrease in memory usage using a large-sized model with 51000 entities. Likewise, in Table 6, an increase of 32.19% in performance and a decrease of 1.34% in memory usage is being illustrated using a large-sized model with 52000 entities.

TABLE 5. Large-sized model. Name: London Olympic Park 2012. 51000 entities. Simulation time: 14 Mins.

Metrics

Original

Multi-threaded

Total Time HH:MM:SS

02:16:25

01:29:50

Memory Usage in MB

940

880

Peak CPU Usage

50.00%

99.00%

TABLE 6. Large-sized model. Name: HOS Case3. 52000 entities. Simulation time: 19 Mins.

Metrics

Original

Multi-threaded

Total Time HH:MM:SS

01:25:04

00:57:41

Memory Usage in MB

373

368

Peak CPU Usage

50.00%

98.00%

The performance gained and the memory usage can be seen in Figure 11. The performance increase ranges between 35% to 65.5% compared to the original single-threaded Legion Analyser on a dual core system[4].

4.4 Distributed Legion Analyser

The following sections describe the design, the implementation and the benchmark results of the prototype version of the Distributed version of the Legion Analyser commercial software.

4.4.1 Design and Implementation

In this section we describe the requirements that shaped the design of the prototype distributed version of the Legion Analyser.

Objectives

The main objective for developing a distributed version of the commercial programme is to provide a system capable of tackling simulations of ever increasing size and complexity. This work aims to demonstrate how the use of a multicomputer can greatly accelerate the speed of a pedestrian movement software.

The main beneficiaries of this research work were the developers of Legion. The demonstration of the increased performance was a benefit to them and to their customers.

Architecture

In the early stages of the development of the Distributed Analyser, the OpenMP standard was considered but such an option was abandoned because OpenMP is limited to be used in a shared-memory environment, i.e. a shared memory cluster[5]. Since we wanted to use the Distributed version of the programme in a network using workstations in a distributed-memory environment, the Message Passing Interface (MPI) library was used to send messages between the nodes and across the network. MPI is the most popular message-passing library standard for parallel programming[6]. The MPICH2 implementation of the version 2.1 (MPI-2) of the standard was chosen together with the Boost.MPI library, part of the Boost C++ library. The Boost.MPI library provides a C++ friendly interface to the MPI standard that better supports modern C++ development styles[7].

Read also  Advantages And Disadvantages Of Different OS Computer Science Essay

The prototype of the Distributed Legion Analyser consists of the Master node and the Slave nodes as illustrated in Figure 12. The Master node is responsible for collecting the results from the Slave nodes, drawing the results on the screen and updating the statistics and the graphs. The Slave nodes are responsible for all the calculations of the maps. The work is divided and evenly distributed between the Slave nodes and a load balancing algorithm makes sure that no Slave node will be idle for a long period of time.

All the nodes open a read-only model on the network and begin the Distributed Analysis. The division of the work is done according to the total nodes registered and the total maps enabled for the analysis session. Each node is registered and a list of all the available nodes exists on the MPI_COMM_WORLD. The map list and the entity list is then fetched together with the list of the computers registered in the MPI_COMM_WORLD. Hence, every node is aware of all the registered nodes taking part in the analysis.

Each registered Slave node starts the calculation of the assigned maps and at every simulation time step, it calculates the assigned maps, serialises the results, packs them using MPI_Pack(), sends them to the Master node using a non-blocking MPI_ISend() and waits for all the other Slave nodes to finish the calculation before advancing to the next time step. The Master node collects the results using MPI_IRecv(), unpacks the packed data using MPI_Unpack(), draws and displays the maps on the screen and updates the graphs and the statistics.

The following pseudo-C++ codes illustrate the use of the MPI for the communication and the division of the work between the nodes. Listing 4 illustrates the initialisation of the MPI communication library.

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis 1

/// Initialise MPI

MPI_Init( NULL, NULL );

// Boost.MPI code

mpi::environment env (NULL,NULL);

mpi::communicator world;

int mynode, totalnodes;

//MPI_Status status;

// Assign a rank to each available node

MPI_Comm_rank( MPI_COMM_WORLD, &mynode );

// Get the total size of the available nodes

MPI_Comm_size( MPI_COMM_WORLD, &totalnodes );

TRACE( “Hello world from process: %d of %dn”, mynode, totalnodes ); // debugging msg

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis 1

Listing 4. The Initialisation of MPI

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis 1

Listing 5 illustrates the division of the work for six Slave nodes.

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis 1

//// Split the jobs according to the size of totalnodes

int start, workEnd;

int node1End, node2End, node3End, node4End, node5End;

switch (mynode)

{

case 1: // 1st worker node

start = 1;

workEnd = mapSize * mynode / (totalnodes-1);

advance(iter, workEnd);

break;

case 2: // 2nd worker node

node1End = mapSize * (mynode-1) / (totalnodes-1);

start = node1End+1;

workEnd = mapSize * mynode / (totalnodes-1);

break;

case 3: // 3rd worker node

node2End = mapSize * (mynode-1) / (totalnodes-1);

start = node2End+1;

workEnd = mapSize * mynode / (totalnodes-1);

break;

case 4: // 4th worker node

node3End = mapSize * (mynode-1) / (totalnodes-1);

start = node3End+1;

workEnd = mapSize * mynode / (totalnodes-1);

break;

case 5: // 5th worker node

node4End = mapSize * (mynode-1) / (totalnodes-1);

start = node4End+1;

workEnd = mapSize * mynode / (totalnodes-1);

break;

case 6: // 6th worker node

node5End = mapSize * (mynode-1) / (totalnodes-1);

start = node5End+1;

workEnd = mapSize * mynode / (totalnodes-1);

break;

default: // for Root (id=0) – just some debugging msg..

TRACE (“Hello from root”);

break;

}

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis 1

Listing 5. Work division for 6 slave nodes.

Most of the communication between the Slave and the Master nodes is being illustrated in the following code listings. Each Slave node calculates a map in a separate thread and then sends the results back to the Master node as illustrated in Listing 6.

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis 1

// IF we have 6 enabled maps & 1 master + 6 cluster nodes then every node will do

// calculations for just one map otherwise work will be divided by totalnodes size.

if (mynode != 0) // workers – sender code

{

MapList::iterator iter(advance(m_mapList.begin(),start));

advance(iter, start); // Beginning of the allocated work for earch worker

MapList::iterator end( m_mapList.begin() ); // Actually it’s the beginning…

advance(end, workEnd); // But now it’s the end of the allocated work for each worker

while( iter != end )

{

const COdbSpaceCentricMap* pSpaceMap = dynamic_cast<const COdbSpaceCentricMap*>( (*iter)->GetMap() );

// Only do calculations for enabled maps

if( pSpaceMap->IsEnabled() )

{

CReSpaceMapManagerItem* pSpaceMapItem = dynamic_cast<CReSpaceMapManagerItem*>(*iter);

ASSERT( pSpaceMapItem );

// Execute the thread

m_threadPool.schedule( SpaceMapTask( pSpaceMapItem, entities ) );

}

++iter;

}

// Join the thread pool as to wait for all the maps to be finished computing

if( !m_threadPool.empty() )

{

m_threadPool.wait();

}

// Call the serialisation & MPI comm function

m_cellStorageManager->SerialiseMe();

}

Listing 6. Sender Code: Each Slave node calculates a map and then sends the results to the Master node.

The Master node collects the results, unpacks them and calls the drawing function to draw the results on the screen as illustrated in Listing 7.

else // root – Receiver code

{

// Get the data, unpack them (if serialised), draw the results (call the draw function)

int wSlave;

// Use a loop to get all the results from all the nodes (equal to totalnodes)

// then unpack them and call the drawing function

wSlave = totalnodes – 1; // wSlave is equal to the total no of nodes minuss the root node

while (wSlave > 0) { // 0 is the root

int position = 0;

MPI_Recv(packbuf, packsize, MPI_PACKED, &a, 1, MPI_INT, MPI_COMM_WORLD, &status);

MPI_Unpack( packbuf, packsize, &position, &a, 1, MPI_INT, MPI_COMM_WORLD );

MPI_Unpack( packbuf, packsize, &position, &b, 1, MPI_DOUBLE,MPI_COMM_WORLD );

// call drawing function here

wSlave–;

}

}

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis 1

Listing 7. Receiver Code: The Master node collects the results and calls the OpenGL drawing function.

Multi-Threaded and Distributed Framework for Pedestrian Simulation Analysis 1

4.4.2 Performance

To benchmark our distributed implementation, we have used an evacuation case study. The area is modelled after the London 2012 Olympic Park and we have populated the model with 56500 entities. The model is available in Appendix XXX.

We have benchmarked our prototype distributed implementation on commodity hardware connected by a gigabit Ethernet switch. Figure 13 illustrates the performance of the distributed programme in terms of the time it takes in seconds to analyse a simulation second as a function of the number of the Slave processors.

The performance scales well as the number of processors is increased. With 6 Slave processors the prototype system is able to analyse 56500 simulated pedestrians in just 3.8 seconds.

4.5 Summary

We have faced many challenges and obstacles during this research project, mainly due to the difficulty of understanding the existing code of the Legion Studio software suite, a 6 GB code with more than 26,000 C++ files but mostly due to the company “secrets” and Intellectual Property (IP).

This chapter presented the requirements and implementation of the Legion Analyser commercial programme. We have developed a framework capable of analysing the simulation data produced by the commercial Legion Studio pedestrian simulation software. The programme has been implemented as a multi-threaded and as a distributed programme written in C++ with calls to the MPI library.

Benchmarking the programme on a dual-core PC and on a commodity cluster of high performance PCs demonstrated the system’s increase in performance compared to the original single-threaded analyser. The performance scales well as the number of processors is increased.

References

1. Legion Studio Software Suite, http://www.legion.com

2. J. L. Berrou, J. Beecham, P. Quaglia, M. A. Kagarlis, A. Gerodimos, “Calibration and Validation of the Legion Simulation Model using Empirical Data”, Pedestrian and Evacuation Dynamics 2005, Springer Berlin Heidelberg, pp. 167-181.

3. F. Rademakers, R. Brun, “ROOT: An Object-Oriented Data Analysis Framework”, Proceedings AIHENP’96 Workshop, Nucl. Inst. Meth. In Phys. Res. A389 (1997) pp. 81-86, Lausanne. See also: http://root.cern.ch

4. Y. Ling, T. Mullen, X. Lin, “Analysis of Optimal Thread Pool Size”, ACM SIGOPS Operating Systems Review, v.34 n.2, pp. 42-55, April 2000.

5. L. Dagum, “OpenMP: A proposed industry standard API for Shared Memory Programming”, http://www.openmp.org, Technical Report, October 1997.

6. M. J. Quinn, “Parallel Programming in C with MPI and OpenMP”, McGraw-Hill, New York, NY, 2004.

7. P. Kambadur, D. Gregor, A. Lumsdaine, A. Dharurkar, “Modernizing the C++ interface to MPI”, Proceedings of the 13th European PVM/MPI Users’ Group Meeting, LNCS, pp. 266-274, Bonn, Germany, September 2006, Springer.

[1]In a quantitatively verifiable manner.

[2]The maps are generated from the entities by adding their contribution to the map.

[3]Needed for generic rendering.

[4] Using a dual core PC. 2GHz of CPU and 2GB of RAM.

Order Now

Order Now

Type of Paper
Subject
Deadline
Number of Pages
(275 words)