2024-03-28T09:05:34Z
https://www.tdx.cat/oai/request
oai:www.tdx.cat:10803/129576
2017-08-29T12:44:20Z
com_10803_183
col_10803_196
TDX (Tesis Doctorals en Xarxa)
author
Tejedor Saavedra, Enric
authoremail
ENRIC.TEJEDOR@BSC.ES
authoremailshow
false
director
Badia Sala, Rosa M. (Rosa Maria)
authorsendemail
true
2014-02-03T13:17:53Z
2014-02-03T13:17:53Z
2013-07-15
http://hdl.handle.net/10803/129576
B 5047-2014
The last decade has witnessed unprecedented changes in parallel and distributed infrastructures. Due to the diminished gains in processor performance from increasing clock frequency, manufacturers have moved from uniprocessor architectures to multicores; as a result, clusters of computers have incorporated such new CPU designs. Furthermore, the ever-growing need of scientific applications for computing and storage capabilities has motivated the appearance of grids: geographically-distributed, multi-domain infrastructures based on sharing of resources to accomplish large and complex tasks. More recently, clouds have emerged by combining virtualisation technologies, service-orientation and business models to deliver IT resources on demand over the Internet.
The size and complexity of these new infrastructures pose a challenge for programmers who want to exploit them. On the one hand, some of the difficulties are inherent to concurrent and distributed programming itself, e.g. dealing with thread creation and synchronisation, messaging, and data partitioning and transfer. On the other hand, other issues are related to the singularities of each scenario, like the heterogeneity of Grid middleware and resources or the risk of vendor lock-in when writing an application for a particular Cloud provider.
In the face of such a challenge, programming productivity - understood as a tradeoff between programmability and performance - has become crucial for software developers. There is a strong need for high-productivity programming models and languages, which should provide simple means for writing parallel and distributed applications that can run on current infrastructures without sacrificing performance.
In that sense, this thesis contributes Java StarSs, a programming model and runtime system for developing and parallelising Java applications on distributed infrastructures. The model has two key features: first, the user programs in a fully-sequential standard-Java fashion - no parallel construct, API call or pragma must be included in the application code; second, it is completely infrastructure-unaware, i.e. programs do not contain any details about deployment or resource management, so that the same application can run in different infrastructures with no changes. The only requirement for the user is to select the application tasks, which are the model's unit of parallelism. Tasks can be either regular Java methods or web service operations, and they can handle any data type supported by the Java language, namely files, objects, arrays and primitives. For the sake of simplicity of the model, Java StarSs shifts the burden of parallelisation from the programmer to the runtime system. The runtime is responsible for modifying the original application to make it create asynchronous tasks and synchronise data accesses from the main program. Moreover, the implicit inter-task concurrency is automatically found as the application executes, thanks to a data dependency detection mechanism that integrates all the Java data types.
This thesis provides a fairly comprehensive evaluation of Java StarSs on three different distributed scenarios: Grid, Cluster and Cloud. For each of them, a runtime system was designed and implemented to exploit their particular characteristics as well as to address their issues, while keeping the infrastructure unawareness of the programming model. The evaluation compares Java StarSs against state-of-the-art solutions, both in terms of programmability and performance, and demonstrates how the model can bring remarkable productivity to programmers of parallel distributed applications.
eng
Programming and parallelising applications for distributed infrastructures
info:eu-repo/semantics/doctoralThesis info:eu-repo/semantics/publishedVersion
URL
https://www.tdx.cat/bitstream/10803/129576/1/TETS1de1.pdf
File
MD5
e859040a9f94f6d7fbf3d7dd69be8ebe
6454307
application/pdf
TETS1de1.pdf
URL
https://www.tdx.cat/bitstream/10803/129576/5/TETS1de1.pdf.txt
File
MD5
053fe5db183a04fc1da110818ce4a5ca
461681
text/plain
TETS1de1.pdf.txt