6,143 results

Article Open access Peer reviewed

Bowen Meng, Guillem Pratx, Lei Xing,

... CBCT/CT using a parallel computing framework called MapReduce. We show the utility of MapReduce for solving large‐scale medical physics problems in ... by porting it to Hadoop, an open‐source MapReduce implementation. Gated phases from a 4DCT scan were reconstructed independently. Following the MapReduce formalism, Map functions were used to filter and ... aggregate those partial backprojections into the whole volume. MapReduce automatically parallelized the reconstruction process on a large ... mean square error between the images obtained using MapReduce and a single‐threaded reference implementation was on ...

Topic(s): Advanced Radiotherapy Techniques

2011 - Wiley | Medical Physics

Article Peer reviewed

Saba Sehrish, Grant Mackey, Pengju Shang, Jun Wang, John Bent,

... performed before existing data-intensive tools such as MapReduce can be used to analyze data. This reorganization ... the data set and then at least one MapReduce program to prepare the data before analyzing it. Running multiple MapReduce phases causes significant overhead for the application, in ... excessive I/O operations. That is for every MapReduce phase, a distributed read and write operation on ... be performed. Our contribution is to develop a MapReduce-based framework for HPC analytics to eliminate the ...

Topic(s): Distributed and Parallel Computing Systems

2012 - Institute of Electrical and Electronics Engineers | IEEE Transactions on Parallel and Distributed Systems

Article Open access Peer reviewed

Sebastian Schönherr, Lukas Forer, Hansi Weißensteiner, Florian Kronenberg, Günther Specht, Anita Kloss‐Brandstätter,

Abstract Background The MapReduce framework enables a scalable processing and analyzing of large datasets by distributing the computational load on connected computer nodes, referred to as a cluster. In Bioinformatics, MapReduce has already been adopted to various case scenarios ... genotype files. Nevertheless, tasks like installing and maintaining MapReduce on a cluster system, importing data into its distributed file system or executing MapReduce programs require advanced knowledge in computer science and ...

Topic(s): Gene expression and cancer classification

2012 - BioMed Central | BMC Bioinformatics

Article Peer reviewed

Jia‐Chun Lin, Fang‐Yie Leu, Ying-ping Chen,

Recently, MapReduce has been widely employed by many companies/organizations to tackle data-intensive problems over a large-scale MapReduce cluster. To solve machine/node failure which is inevitable in a MapReduce cluster, MapReduce employs several policies, such as input-data replication ... data replication policies. To speed up job execution, MapReduce allows reduce tasks to early fetch their required ... job energy consumption (JEC for short) of a MapReduce cluster was not clear, where JCR is the reliability with which a MapReduce job can be completed by the cluster, whereas ...

Topic(s): Blockchain Technology Applications and Security

2014 - Institute of Electrical and Electronics Engineers | IEEE Transactions on Parallel and Distributed Systems

Article Open access

Liya Thomas, R Syama,

MapReduce is a programming model used by Google to process large amounts of data in a distributed ... a file system or a database usually occurs. MapReduce takes advantage of locality of data, processing ... failures hiding the complexity of fault tolerance allow MapReduce to be used for both commercial and scientific applications. As MapReduce clusters have become popular these days, their scheduling ... considered. In order to achieve good performance, a MapReduce scheduler must avoid unnecessary data transmission. Hence different ...

Topic(s): Artificial Intelligence in Healthcare

2014 | International Journal of Computer Applications

Article Peer reviewed

Junbo Zhang, Dong Xiang, Tianrui Li, Yi Pan,

MapReduce is a very popular parallel programming model for cloud computing platforms, and has become an effective ... by using a cluster of computers. X-to-MapReduce (X is a program language) translator is a ... to cloud systems through translating sequential codes to MapReduce codes. Recently, some SQL-to-MapReduce translators emerge to translate SQL-like queries to MapReduce codes and have good performance in cloud systems. However, SQL-to-MapReduce translators mainly focus on SQL-like queries, but ... We propose and develop a simple Matlab-to-MapReduce translator for cloud computing, called M2M, for basic ...

Topic(s): Advanced Data Storage Technologies

2013 - Tsinghua University Press | Tsinghua Science & Technology

Article Open access Peer reviewed

Xiaoyong Xu, Maolin Tang,

... is the most important objective of cloud-based MapReduce computations. Minimizing the total computation cost of cloud-based MapReduce computations is done through MapReduce placement optimization. MapReduce placement optimization approaches can be classified into two categories: homogeneous MapReduce placement optimization and heterogeneous MapReduce placement optimization. It is generally believed that heterogeneous MapReduce placement optimization is more effective than homogeneous MapReduce placement optimization in reducing the total running cost ...

Topic(s): Data Management and Algorithms

2015 - Institute of Electrical and Electronics Engineers | IEEE Transactions on Services Computing

Article Peer reviewed

Zeke Wang, Shuhao Zhang, Bingsheng He, Wei Zhang,

MapReduce, originally developed by Google for search applications, has recently become a popular programming framework for parallel ... paper presents an energy-efficient architecture design for MapReduce on Field Programmable Gate Arrays (FPGAs). The major ... to enable users to program FPGAs with simple MapReduce interfaces, and meanwhile to embrace automatic performance optimizations within the MapReduce framework. Compared to other processors like CPUs and ... energy consumption. However, the design and implementation of MapReduce on FPGAs can be challenging: firstly, FPGAs are ...

Topic(s): Advanced Data Storage Technologies

2016 - Institute of Electrical and Electronics Engineers | IEEE Transactions on Parallel and Distributed Systems

Article Open access Peer reviewed

Shivani Sharma, Durga Toshniwal,

... for experimenting with Big Data approaches (e.g., MapReduce Framework). We have agglomerated the MapReduce framework with adopted heuristics to overcome this challenge ... yields efficient analytic results within bounded execution times. MapReduce is a parallel programming framework [16] which provides ... resources to deal with the Big Data analytics. MapReduce allows the resource of a largely distributed system ... fault-tolerance are the key features which make MapReduce a promising framework. Therefore, we have proposed a ...

Topic(s): Blockchain Technology Applications and Security

2017 - Springer Science+Business Media | Journal Of Big Data

Article Open access Peer reviewed

Phongphun Kijsanayothin, Gantaphon Chalumporn, Rattikorn Hewett,

... many Big Data analytics algorithms are contributed by MapReduce, a programming paradigm that enables parallel and distributed ... Much research has focused on building efficient naive MapReduce-based algorithms or extending MapReduce mechanisms to enhance performance. However, we argue that ... directions to pursue. We conjecture that when naive MapReduce-based solutions do not perform well, it could ... certain classes of algorithms are not amenable to the MapReduce model and one should find a fundamentally different ...

Topic(s): Big Data and Business Intelligence

2019 - Springer Science+Business Media | Journal Of Big Data

Book chapter

Fabrizio Marozzo, Domenico Talia, Paolo Trunfio,

MapReduce is a programming model widely used in Cloud computing environments for processing large data sets in a highly parallel way. MapReduce implementations are based on a master-slave model. ... while master failures are not managed by current MapReduce implementations, as designers consider failures unlikely in reliable ... manage master failures is fundamental to exploit the MapReduce model in the implementation of data-intensive applications in those dynamic Cloud environments where current MapReduce implementations could be unreliable. The goal of our ...

Topic(s): Distributed and Parallel Computing Systems

2010 | Computer communications and networks

Article Open access Peer reviewed

Herodotos Herodotou, Fei Dong, Shivnath Babu,

MapReduce has emerged as a viable competitor to database systems in big data analytics. MapReduce programs are being written for a wide variety ... and social network analysis, and computational science. However, MapReduce systems lack a feature that has been key ... A major challenge here is that, to the MapReduce system, a program consists of black-box map ... Cost-based Optimizer for simple to arbitrarily complex MapReduce programs. Starfish also includes a Profiler to collect detailed statistical information from unmodified MapReduce programs, and a What-if Engine for fine- ...

Topic(s): Data Management and Algorithms

2011 - Association for Computing Machinery | Proceedings of the VLDB Endowment

Article Open access Peer reviewed

Herodotos Herodotou, Shivnath Babu,

MapReduce has emerged as a viable competitor to database systems in big data analytics. MapReduce programs are being written for a wide variety ... and social network analysis, and computational science. However, MapReduce systems lack a feature that has been key ... A major challenge here is that, to the MapReduce system, a program consists of black-box map ... Cost-based Optimizer for simple to arbitrarily complex MapReduce programs. We focus on the optimization opportunities presented ... Profiler to collect detailed statistical information from unmodified MapReduce programs, and a What-if Engine for fine- ...

Topic(s): Advanced Data Storage Technologies

2011 - Association for Computing Machinery | Proceedings of the VLDB Endowment

Article Open access Peer reviewed

Fabrizio Marozzo, Domenico Talia, Paolo Trunfio,

MapReduce is a programming model for parallel data processing widely used in Cloud computing environments. Current MapReduce implementations are based on centralized master-slave architectures ... at high rates. We have designed an adaptive MapReduce framework, called P2P-MapReduce, which exploits a peer-to-peer model to ... way, so as to provide a more reliable MapReduce middleware that can be effectively exploited in dynamic Cloud infrastructures. This paper describes the P2P-MapReduce system providing a detailed description of its basic ...

Topic(s): Advanced Data Storage Technologies

2011 - Elsevier BV | Journal of Computer and System Sciences

Article Peer reviewed

Satish Narayana Srirama, Pelle Jakovits, Eero Vainikko,

... can successfully exploit the cloud resources, like the MapReduce framework. This paper summarizes the challenges associated with reducing iterative algorithms to the MapReduce model. Algorithms used by scientific computing are divided ... by how they can be adapted to the MapReduce model; examples from each such class are reduced to the MapReduce model and their performance is measured and analyzed. The study mainly focuses on the Hadoop MapReduce framework but also compares it to an alternative MapReduce framework called Twister, which is specifically designed for ...

Topic(s): Cloud Data Security Solutions

2011 - Elsevier BV | Future Generation Computer Systems

Article Open access Peer reviewed

Fred Highland, John Stephenson,

While the Hadoop MapReduce paradigm offers a linearly scalable approach to solving many complex problems, it does not work for every problem type. ... problems that can and cannot be solved with MapReduce have been discussed in a number of sources but the requirements for effective use of MapReduce are not clear. This paper takes the approach ... must be understood to implement a solution in MapReduce. The paper examines the MapReduce paradigm and derives the key requirements and constraints ... algorithm and data specifications make effective use of MapReduce. These characteristics can also provide guidance to refactoring ...

Topic(s): Big Data and Business Intelligence

2012 - Elsevier BV | Procedia Computer Science

Article Open access Peer reviewed

Rong Chen, Haibo Chen,

... clusters on a single machine with many cores. MapReduce, a simple and elegant programming model to program ... new challenges to design and implement an efficient MapReduce system on multicore. This article argues that it is more efficient for MapReduce to iteratively process small chunks of data in ... Based on the argument, we extend the general MapReduce programming model with a “tiling strategy”, called Tiled-MapReduce (TMR). TMR partitions a large MapReduce job into a number of small subjobs and ... of all subjobs for output. Based on Tiled-MapReduce, we design and implement several optimizing techniques targeting ...

Topic(s): Advanced Data Storage Technologies

2013 - Association for Computing Machinery | ACM Transactions on Architecture and Code Optimization
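
The tiling idea summarized in the Tiled-MapReduce entry above (split one large job into small subjobs, then combine the subjob results) can be illustrated with a minimal Python sketch. This is not the TMR system, which targets multicore runtimes; the function names `run_subjob` and `tiled_word_count`, the word-count workload, and the `tile_size` parameter are all assumptions chosen for illustration.

```python
from collections import Counter


def run_subjob(chunk):
    """One small subjob: map and reduce over a single tile of the input."""
    partial = Counter()
    for record in chunk:
        for word in record.split():
            partial[word] += 1
    return partial


def tiled_word_count(records, tile_size):
    """Process the input one tile at a time, then merge the partial results.

    Hypothetical illustration only: each tile is handled by its own subjob,
    so only one tile's intermediate data is live at any moment.
    """
    if tile_size <= 0:
        raise ValueError("tile_size must be positive")
    total = Counter()
    for start in range(0, len(records), tile_size):
        tile = records[start:start + tile_size]
        total.update(run_subjob(tile))  # Counter.update adds counts
    return dict(total)
```

Calling `tiled_word_count(lines, tile_size=2)` gives the same counts as processing all lines in one pass; only the size of the working set per step changes.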

Article Open access Peer reviewed

Pedro Ferrera, Ivan de Prado, Eric Palacios, Jose Luis Fernandez-Marquez, Giovanna Di Marzo Serugendo,

This paper presents Tuple MapReduce, a new foundational model extending MapReduce with the notion of tuples. Tuple MapReduce allows to bridge the gap between the low-level constructs provided by MapReduce and higher-level needs required by programmers, such ... well Pangool, an open-source framework implementing Tuple MapReduce. Pangool eases the design and implementation of applications based on MapReduce and increases their flexibility, still maintaining Hadoop’s ... database exploiting Pangool. These results show that Tuple MapReduce can be used as a direct, better-suited ...

Topic(s): Data Quality and Management

2013 - Springer Science+Business Media | Knowledge and Information Systems

Article Peer reviewed

Rong Gu, Xiaoliang Yang, Jinshuang Yan, Yuanhao Sun, Bing Wang, Chunfeng Yuan, Yihua Huang,

... framework for big data processing today, the Hadoop MapReduce framework puts more emphasis on high-throughput of ... more and more big data applications developed with MapReduce require quick response time. As a result, improving the performance of MapReduce jobs, especially for short jobs, is of great ... approach to improve the performance of the Hadoop MapReduce framework by optimizing the job and task execution ... analyzing the job and task execution mechanism in MapReduce framework we reveal two critical limitations to job ...

Topic(s): Advanced Data Storage Technologies

2013 - Elsevier BV | Journal of Parallel and Distributed Computing

Article Open access Peer reviewed

Xiao Ling, Yi Yuan, Dan Wang, Jiangchuan Liu, Jiahai Yang,

MapReduce-like frameworks have achieved tremendous success for large-scale data processing in data centers. A key feature distinguishing MapReduce from previous parallel models is that it interleaves ... models are therefore, unlikely to be applied to MapReduce directly. There are many recent studies on MapReduce job and task scheduling. These studies assume that ... assigned in advance. In current data centers, multiple MapReduce jobs of different importance levels run together. In this paper, we investigate a schedule problem for MapReduce taking server assignment into consideration as well. We ...

Topic(s): Blockchain Technology Applications and Security

2016 - Elsevier BV | Journal of Parallel and Distributed Computing

Review Open access

T. Y. J. Naga Malleswari, G. Vadivu,

MapReduce, a programming model, allows parallel processing of large data sets where various data mining ... their application. This paper gives an idea of MapReduce, its advantages and disadvantages. This paper also focuses on how MapReduce is used, how map and reduce computations are ... by using pilot abstractions. We also represent how MapReduce is used for deduplication of files to save disk space in data centers. MapReduce based Pre-Post (MRPre-Post) a parallel data ... is adapted in Hadoop platform to achieve scalability. MapReduce is implemented in vHadoop (Virtual Hadoop), a scalable ...

Topic(s): Data Mining Algorithms and Applications

2016 - Indian Society for Education and Environment | Indian Journal of Science and Technology

Article Open access

Philip Derbeko, Shlomi Dolev, Ehud Gudes, Shantanu Sharma,

MapReduce is a programming system for distributed processing of large-scale data in an efficient and fault ... manner on a private, public, or hybrid cloud. MapReduce is extensively used daily around the world as ... social networks. Security and privacy of data and MapReduce computations are essential concerns when a MapReduce computation is executed in public or hybrid clouds. In order to execute a MapReduce job in public and hybrid clouds, authentication of ... from several types of attacks on data and MapReduce computations. In this paper, we investigate and discuss ...

Topic(s): Stochastic Gradient Optimization Techniques

2016 - Elsevier BV | Computer Science Review

Article Peer reviewed

Bing Tang, Mingdong Tang, Gilles Fedak, Hong S. He,

MapReduce offers an ease-of-use programming paradigm for processing large datasets. In our previous work, we have designed a MapReduce framework called BitDew-MapReduce for desktop grid and volunteer computing environment, that allows nonexpert users to run data-intensive MapReduce jobs on top of volunteer resources over the ... distance and resource availability have great impact on MapReduce applications running over the Internet. To address this, an availability and network-aware MapReduce framework over the Internet is proposed. Simulation results ...

Topic(s): Caching and Content Delivery

2016 - Elsevier BV | Information Sciences

Article Peer reviewed

Pandu Sowkuntla, P. S. V. S. Sai Prasad,

... systems. In recent years with the proliferation of MapReduce for distributed/parallel algorithms, several scalable reduct computation ... this field for large-scale decision systems using MapReduce. The existing MapReduce based reduct computation approaches use horizontal partitioning (division ... a simplified shuffle and sort phase of the MapReduce framework. MR_IQRA_VP is a distributed/parallel ... Algorithm (IQRA_IG) and is implemented using iterative MapReduce framework of Apache Spark. We have done an ... in the areas of Bioinformatics and Web mining. MapReduce based attribute reduction algorithm is proposed using Rough ...

Topic(s): Text and Document Classification Technologies

2019 - Elsevier BV | Knowledge-Based Systems

Article Peer reviewed

Saeed Mirpour Marzuni, Abdorreza Savadi, Adel N. Toosi, Mahmoud Naghibzadeh,

The MapReduce model is widely used to store and process big data in a distributed manner. MapReduce was originally developed for a single tightly coupled ... Geo-Hadoop are designed to address geo-distributed MapReduce processing. However, these methods still suffer from high ... to a single global reducer, we propose Cross-MapReduce, a framework for geo-distributed MapReduce processing. Before any massive data transfer, our proposed ... real testbed to demonstrate the effectiveness of Cross-MapReduce. The experimental results show that Cross-MapReduce significantly ...

Topic(s): Privacy-Preserving Technologies in Data

2020 - Elsevier BV | Future Generation Computer Systems

Article Open access Peer reviewed

Wei-Chun Chung, Chien-Chih Chen, Jan-Ming Ho, Chung‐Yen Lin, Wen-Lian Hsu, Yu‐Chun Wang, D. T. Lee, Feipei Lai, Chih‐Wei Huang, Yu-Jung Chang,

... environment for large-scale data analysis. Using a MapReduce framework, data and workload can be distributed via ... the cloud to substantially reduce computational latency. Hadoop/MapReduce has been successfully adopted in bioinformatics for genome ... Hadoop cloud for those who prefer to run MapReduce programs in a cluster without built-in Hadoop/MapReduce. Results We present CloudDOE, a platform-independent software ... Operate wizard allows the user to run a MapReduce application on the dashboard list. To extend the ...

Topic(s): Distributed and Parallel Computing Systems

2014 - Public Library of Science | PLoS ONE

Article Peer reviewed

Wenbin Fang, Bingsheng He, Qiong Luo, Naga K. Govindaraju,

We design and implement Mars, a MapReduce runtime system accelerated with graphics processing units (GPUs). MapReduce is a simple and flexible parallel programming paradigm originally proposed by Google, for ... less familiar than those on the CPUs to MapReduce programmers. To harness GPUs' power for MapReduce, we developed Mars to run on NVIDIA GPUs, ... Mars into Hadoop, an open-source CPU-based MapReduce system. Mars hides the programming complexity of GPUs behind the simple and familiar MapReduce interface, and automatically manages task partitioning, data distribution, ...

Topic(s): Parallel Computing and Optimization Techniques

2010 - Institute of Electrical and Electronics Engineers | IEEE Transactions on Parallel and Distributed Systems

Article Peer reviewed

Saurabh Sehgal, Miklós Erdélyi, André Merzky, Shantenu Jha,

... Our approach is: (i) Given the simplicity of MapReduce, its widespread usage, and its ability to capture the primary challenges of developing distributed applications, use MapReduce as the underlying exemplar; we develop an interoperable implementation of MapReduce using SAGA — an API to support distributed programming, ( ... the canonical wordcount application that uses SAGA-based MapReduce, we investigate its scale-out across clusters, clouds ... iii) Establish the execution of wordcount application using MapReduce and other programming models such as Sphere concurrently. ...

Topic(s): Advanced Data Storage Technologies

2010 - Elsevier BV | Future Generation Computer Systems
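
The entry above refers to the canonical wordcount application. For readers unfamiliar with it, here is a minimal single-process Python sketch of the map, shuffle, and reduce steps. It does not use SAGA, Hadoop, or any distributed runtime; the function names (`map_words`, `shuffle`, `reduce_counts`, `word_count`) are assumptions introduced only for this illustration.

```python
from collections import defaultdict
from itertools import chain


def map_words(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_counts(word, counts):
    """Reduce step: sum the partial counts collected for one word."""
    return word, sum(counts)


def word_count(lines):
    """Run map, shuffle, and reduce in sequence over an iterable of lines."""
    mapped = chain.from_iterable(map_words(line) for line in lines)
    grouped = shuffle(mapped)
    return dict(reduce_counts(word, counts) for word, counts in grouped.items())
```

In a real framework the map calls and the per-key reduce calls would run on different workers; here they run in order in one process so the data flow is easy to follow.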

Article Peer reviewed

Da David Jiang, Anthony K. H. Tung, Gang Chen,

... data to be processed over very large clusters. MapReduce is recognized as a popular way to handle ... However, compared to parallel databases, the performance of MapReduce is slower when it is adopted to perform ... compute certain aggregates. A common concern is whether MapReduce can be improved to produce a system with ... Join-Reduce, a system that extends and improves MapReduce runtime framework to efficiently process complex data analysis ... join-aggregation programming model, a natural extension of MapReduce's filtering-aggregation programming model. Then, we present ...

Topic(s): Advanced Database Systems and Queries

2010 - IEEE Computer Society | IEEE Transactions on Knowledge and Data Engineering

Article Peer reviewed

Steven J. Plimpton, Karen Devine,

... that allows algorithms to be expressed in the MapReduce paradigm. This means the calling program does not ... data movement between processors. We describe how typical MapReduce functionality can be implemented in an MPI context, ... to enable graph algorithms to be written as MapReduce operations, allowing processing of terabyte-scale data sets on traditional MPI-based clusters. We outline MapReduce versions of several such algorithms: vertex ranking via ... generate randomized R-MAT matrices in parallel; a MapReduce version of this operation is also described. Performance ...

Topic(s): Advanced Graph Neural Networks

2011 - Elsevier BV | Parallel Computing