Portal de Periódicos da CAPES

7,050 results

Book chapter | Open access | Peer reviewed

1. Automatic C-to-CUDA Code Generation for Affine Programs

Muthu Manikandan Baskaran, J. Ramanujam, P. Sadayappan

Graphics Processing Units (GPUs) offer tremendous computational power. CUDA (Compute Unified Device Architecture) provides a multi-threaded ... parallel view make manual development of high-performance CUDA code rather complicated. Hence the automatic transformation of sequential input programs into efficient parallel CUDA programs is of considerable interest. This paper describes an automatic code transformation system that generates parallel CUDA code from input sequential C code, for regular ( ... optimization practically effective, we develop a C-to-CUDA transformation system that generates two-level parallel CUDA ...

Topic(s): Real-Time Systems Scheduling

2010 - Springer Science+Business Media | Lecture Notes in Computer Science

Article | Peer reviewed

2. Swan: A tool for porting CUDA programs to OpenCL

M J Harvey, Gianni De Fabritiis

... The majority of this work has used the CUDA programming model supported exclusively by GPUs manufactured by ... Swan" for facilitating the conversion of an existing CUDA code to use the OpenCL model, as a means to aid programmers experienced with CUDA in evaluating OpenCL and alternative hardware. While the performance of equivalent OpenCL and CUDA code on fixed hardware should be comparable, we find that a real-world CUDA application ported to OpenCL exhibits an overall 50% ... portable GPU applications but that the more mature CUDA tools continue to provide best performance. Program title: ...

Topic(s): Software Testing and Debugging Techniques

2011 - Elsevier BV | Computer Physics Communications

Article | Peer reviewed

3. Improved CUDA programs for GPU computing of Swendsen–Wang multi-cluster spin flip algorithm: 2D and 3D Ising, Potts, and XY models

Yukihiro Komura, Yutaka Okabe

We present new versions of sample CUDA programs for the GPU computing of the Swendsen–Wang multi-cluster spin flip algorithm. In this update, we add the method of ... 26316 Distribution format: tar.gz Programming language: C, CUDA. Computer: System with an NVIDIA CUDA enabled GPU. Operating system: No limits (tested on ... multi-cluster spin flip Monte Carlo method. The CUDA implementation for the cluster-labeling is based on ... for high-precision Monte Carlo simulations. In the CUDA, the cuRAND library [2], which focuses on the ...

Topic(s): Random Matrices and Applications

2015 - Elsevier BV | Computer Physics Communications

Article | Peer reviewed

4. CUDA‐quicksort: an improved GPU‐based implementation of quicksort

Emanuele Manca, Andrea Manconi, Alessandro Orro, Giuliano Armano, Luciano Milanesi

... the GPU‐quicksort, a compute‐unified device architecture (CUDA) iterative implementation, and the CUDA dynamic parallel (CDP) quicksort, a recursive implementation provided by NVIDIA Corporation. We propose CUDA‐quicksort an iterative GPU‐based implementation of the sorting algorithm. CUDA‐quicksort has been designed starting from GPU‐quicksort. ... performed on six sorting benchmark distributions show that CUDA‐quicksort is up to four times faster than ... An in‐depth analysis of the performance between CUDA‐quicksort and GPU‐quicksort shows that the main ...

Topic(s): Advanced Data Storage Technologies

2015 - Wiley | Concurrency and Computation Practice and Experience

Article | Open access | National production | Peer reviewed

5. OpenMP, OpenMP/MPI, and CUDA/MPI C programs for solving the time-dependent dipolar Gross–Pitaevskii equation

Vladimir Lončar, Luis E. Young-S., Srdjan Škrbić, Paulsamy Muruganandam, Sadhan K. Adhikari, Antun Balaž

... new versions of the previously published C and CUDA programs for solving the dipolar Gross–Pitaevskii equation ... on distributed-memory systems. Finally, previous three-dimensional CUDA-parallelized programs are further parallelized using MPI, similarly ... comparison with the previous sequential C and parallel CUDA programs. The improvements to the sequential version yield ... on a computer cluster with 32 nodes used. CUDA/MPI version shows a speedup of 9–10 ... with 32 nodes. Program Title: DBEC-GP-OMP-CUDA-MPI: (1) DBEC-GP-OMP package: (i) imag1dX- ...

Topic(s): Cold Atom Physics and Bose-Einstein Condensates

2016 - Elsevier BV | Computer Physics Communications

Sources: Computer Physics Communications; arXiv (Cornell University); Americanae (AECID Library); DataCite API

Article | Peer reviewed

6. Efficient fingerprint matching using GPU

Mubeen Ghafoor, Shahzaib Iqbal, Syed Ali Tariq, Imtiaz Ahmad Taj, Noman M. Jafri

... NVIDIA [23-25] introduced 'compute unified device architecture' (CUDA) in 2006. GPUs have been used efficiently in ... 2. Section 3 discusses the GPU and NVIDIA CUDA architecture. Section 4 discusses the proposed implementation of ... overview of the GPU architecture and introduces NVIDIA CUDA programming architecture. 3 GPU and NVIDIA CUDA architecture To transform or map CPU algorithm to ... power of GPU can be optimally utilised. NVIDIA CUDA is the hardware/software architecture where hardware architecture ...

Topic(s): Forensic Fingerprint Detection Methods

2017 - Institution of Engineering and Technology | IET Image Processing

Article | Open access | Peer reviewed

7. Modeling and analyzing evaluation cost of CUDA kernels

Stefan K. Muller, Jan Hoffmann

... high throughput in vector-parallel applications. NVIDIA's CUDA toolkit seeks to make GPGPU programming accessible by ... small extension of C/C++. However, due to CUDA's complex execution model, the performance characteristics of CUDA kernels are difficult to predict, especially for novice ... paper introduces a novel quantitative program logic for CUDA kernels, which allows programmers to reason about both functional correctness and resource usage of CUDA kernels, paying particular attention to a set of ...

Topic(s): Embedded Systems Design Techniques

2021 - Association for Computing Machinery | Proceedings of the ACM on Programming Languages

Article | Peer reviewed

8. Analysis of the promoter of the cudA gene reveals novel mechanisms of Dictyostelium cell type differentiation

Masashi Fukuzawa, Jeffrey G. Williams

ABSTRACT The cudA gene encodes a nuclear protein that is essential for normal multicellular development. At the slug stage cudA is expressed in the prespore cells and in ... show that cap site distal promoter sequences direct cudA expression in prespore cells, while proximal sequences direct ... acting part of the prespore domain of the cudA promoter. However, Dd-STATa cannot be utilised for ... shows that Dd-STATa is not necessary for cudA transcription in prespore cells. In contrast, the part of the cudA promoter that directs prestalk-specific expression contains a ...

Topic(s): Biocrusts and Microbial Ecology

2000 - The Company of Biologists | Development

Sources: Development; PubMed

Article | Peer reviewed

9. Implementation of the CPU/GPU hybrid parallel method of characteristics neutron transport calculation using the heterogeneous cluster with dynamic workload assignment

Peitao Song, Zhijian Zhang, Qian Zhang, Liang Liang, Qiang Zhao

... cluster. In this paper, a heterogeneous MPI + OpenMP/CUDA parallel algorithm for solving the 2D neutron transport ... exploited through OpenMP (in CPU calculated domain) and CUDA (in GPU calculated domain) based on the ray ... Moreover, the strong scaling performance of the MPI + CUDA parallelization is studied through a performance analysis model ... GPUs, and the MPI communication in the MPI + CUDA parallel algorithm. And the corresponding conclusion is still tenable for the MPI + OpenMP/CUDA parallelization. The C5G7 2D benchmark and an extended ...

Topic(s): Advanced Neural Network Applications

2019 - Elsevier BV | Annals of Nuclear Energy

Article | Open access | Peer reviewed

10. A new family of transcription factors

Yoko Yamada, Hong Yu Wang, Masashi Fukuzawa, Geoffrey J. Barton, Jeffrey G. Williams

CudA, a nuclear protein required for Dictyostelium prespore-specific gene expression, binds in vivo to the promoter ... 14 nucleotide region of the cotC promoter binds CudA in vitro and ECudA, an Entamoeba CudA homologue, also binds to this site. The CudA and ECudA DNA-binding sites contain a dyad and, consistent with a symmetrical binding site, CudA forms a homodimer in the yeast two-hybrid system. Mutation of CudA binding sites within the cotC promoter reduces expression from cotC in prespore cells. The CudA and ECudA proteins share a 120 amino acid ...

Topic(s): interferon and immune responses

2008 - The Company of Biologists | Development

Sources: Development; Europe PMC (PubMed Central); PubMed Central; PubMed

Article | Open access | Peer reviewed

11. Fast Morphological Image Processing Open-Source Extensions for GPU Processing With CUDA

Matthew J. Thurley, V. Danell

... for faster morphological image processing, and the NVIDIA CUDA architecture offers a relatively inexpensive and powerful framework ... generic morphological erosion and dilation operation in the CUDA NPP library is relatively naive, and performance scales ... morphological image processing community. Open-source extensions to CUDA (hereafter referred to as LTU-CUDA) have been produced for erosion and dilation using ... by forgoing the use of shared memory in CUDA multiprocessors. The vHGW algorithm for erosion and dilation ...

Topic(s): Advanced Neural Network Applications

2012 - Institute of Electrical and Electronics Engineers | IEEE Journal of Selected Topics in Signal Processing

Sources: IEEE Journal of Selected Topics in Signal Processing; KTH Publication Database DiVA (KTH Royal Institute of Technology)

Article

12. CUDA-NP

Yi Yang, Huiyang Zhou

... parallel program, such as a GPU kernel in CUDA programs, still contains both sequential code and ... our proposed solution to exploit nested parallelism in CUDA, referred to as CUDA-NP. With CUDA-NP, we initially enable a high number of ... for different code sections. We implemented our proposed CUDA-NP framework using a directive-based compiler approach. ... like pragmas for parallelizable code sections. Then, our CUDA-NP compiler automatically generates the optimized GPU ... optimized and contain nested parallelism, our proposed CUDA-NP framework further improves the performance by ...

Topic(s): Interconnection Networks and Systems

2014 - Association for Computing Machinery | ACM SIGPLAN Notices

Article | Peer reviewed

13. Efficient compilation of CUDA kernels for high-performance computing on FPGAs

Alexandros Papakonstantinou, Karthik Gururaj, John A. Stratton, Deming Chen, Jason Cong, Wen‐mei Hwu

... this work, we adapt one such language, the CUDA programming model, into a new FPGA design flow ... the coarse- and fine-grained parallelism exposed in CUDA onto the reconfigurable fabric. Our CUDA-to-FPGA flow employs AutoPilot, an advanced high- ... that transforms the SIMT (Single Instruction, Multiple Thread) CUDA code into task-level parallel C code for AutoPilot. We describe the details of our CUDA-to-FPGA flow and demonstrate the highly competitive ... best of our knowledge, this is the first CUDA-to-FPGA flow to demonstrate the applicability and ...

Topic(s): Interconnection Networks and Systems

2013 - Association for Computing Machinery | ACM Transactions on Embedded Computing Systems

Article | Open access | Peer reviewed

14. CUDA-NP: Realizing Nested Thread-Level Parallelism in GPGPU Applications

Yi Yang, Chao Li, Huiyang Zhou

... parallel program, such as a GPU kernel in CUDA programs, still contains both sequential code and parallel ... our proposed solution to exploit nested parallelism in CUDA, referred to as CUDA-NP. With CUDA-NP, we initially enable a high number of ... for different code sections. We implement our proposed CUDA-NP framework using a directive-based compiler approach. ... like pragmas for parallelizable code sections. Then, our CUDA-NP compiler automatically generates the optimized GPU kernels. ... been optimized and contain nested parallelism, our proposed CUDA-NP framework further improves the performance by up ...

Topic(s): Interconnection Networks and Systems

2015 - Springer Science+Business Media | Journal of Computer Science and Technology

Sources: Journal of Computer Science and Technology; CiteSeer X (The Pennsylvania State University)

Article | Open access | Peer reviewed

15. Accelerating numerical solution of stochastic differential equations with CUDA

Michał Januszewski, Marcin Kostur

... with popular NVIDIA Graphics Processing Units using the CUDA programming environment. We address general aspects of numerical ... etc.: 5905 Distribution format: tar.gz Programming language: CUDA C Computer: any system with a CUDA-compatible GPU Operating system: Linux RAM: 64 MB ... 3 External routines: The program requires the NVIDIA CUDA Toolkit Version 2.0 or newer and the ... and perform the calculations on GPUs using the CUDA programming environment. The GPU's ability to execute ... question is performed on a GPU using the CUDA environment. Running time: < 1 minute

Topic(s): stochastic dynamics and bifurcation

2009 - Elsevier BV | Computer Physics Communications

Sources: Computer Physics Communications; arXiv (Cornell University); DataCite API

Article | Open access | Peer reviewed

16. Utilisations digestive et métabolique comparées de la fève, de la lentille et du pois chiche chez le rat

Étiennette Combe, T. Achi, R. Pion, MC Valluy, ML Houlier, M. SALLAS, A. SELLE

... meet the requirements for growth. The CUDa of nitrogen is, respectively, 72 - 75 - ... in the case of the faba bean - lentil - chickpea groups, but the CUDa of certain essential amino acids is markedly more ... 71 - 75 for valine, whereas the CUDa of arginine is always higher 87 - ... to suit growth requirements. Nitrogen apparent digestibility coefficient (CUDa) was 72% in the faba bean, 75% in ... the chickpea groups respectively, but the CUDa of some essential amino acids was much lower: ... cystine, 73 - 71 - 75% for valine, while arginine CUDa values (87 - 87 - 82) were higher than all ...

Topic(s): Proteins in Food Systems

1991 - Elsevier BV | Annales de biologie animale biochimie biophysique

Sources: Annales de biologie animale biochimie biophysique; HAL (Le Centre pour la Communication Scientifique Directe)

Article | Peer reviewed

17. Programming Massively Parallel Processors. A Hands-on Approach

Jie Cheng

... by using an extension to C language, in CUDA which is a parallel programming environment supported on ... Hwu is principal investigator for the first NVIDIA CUDA Center of Excellence at the University of Illinois ... It also covers data parallelism, the basics of CUDA memory/threading models, the CUDA extensions to the C language, and the basic ... 7) enhances student programming skills by explaining the CUDA memory model and its types, strategies for reducing global memory traffic, the CUDA threading model and granularity which include thread scheduling ...

Topic(s): Cloud Computing and Resource Management

2010 - | Scalable Computing Practice and Experience

Article | Peer reviewed

18. CUDA-Based Radiative Transfer Method with Application to the EM Scattering from a Two-Layer Canopy Model

Wenqian Jiang, Menghao Zhang, Yichen Wang

... from vegetations. Nevertheless, the Compute Unified Device Architecture (CUDA) gives developers access to the virtual instruction set ... memory of the parallel computational elements in the CUDA compatible Graphics Processing Unit (GPU), which encourages us to develop a CUDA-based simulator for the solution. This paper analyzes the radiative transfer method and the CUDA architecture, and then presents a CUDA parallel algorithm for calculating the EM scattering from a two-layer vegetation canopy. In the CUDA-based simulation, with a GTS250 GPU as, which ...

Topic(s): Cryospheric studies and observations

2010 - Taylor & Francis | Journal of Electromagnetic Waves and Applications

Article | Open access | Peer reviewed

19. Parallel mutual information estimation for inferring gene regulatory networks on GPUs

Haixiang Shi, Bertil Schmidt, Weiguo Liu, Wolfgang Müller‐Wittig

... we have used the Compute Unified Device Architecture (CUDA) programming model to design and implement a new parallel algorithm. Our implementation, called CUDA-MI, can achieve speedups of up to 82 ... datasets. We have used the results obtained by CUDA-MI to infer gene regulatory networks (GRNs) from ... existing methods including ARACNE and TINGe show that CUDA-MI produces GRNs of higher quality in less time. CUDA-MI is publicly available open-source software, written in CUDA and C++ programming languages. It obtains significant speedup ...

Topic(s): DNA and Biological Computing

2011 - BioMed Central | BMC Research Notes

Sources: BMC Research Notes; DOAJ (DOAJ: Directory of Open Access Journals); Europe PMC (PubMed Central); PubMed Central; PubMed

Article | Open access

20. GPGPU Processing in CUDA Architecture

Jayshree Ghorpade-Aher

... well. In this paper, we will show how CUDA can fully utilize the tremendous power of these GPUs. CUDA is NVIDIA's parallel computing architecture. It enables ... power of the GPU. This paper talks about CUDA and its architecture. It takes us through a comparison of CUDA C/C++ with other parallel programming languages like ... paper also lists out the common myths about CUDA and how the future seems to be promising for CUDA.

Topic(s): Advanced Image and Video Retrieval Techniques

2012 - | Advanced Computing An International Journal

Sources: Advanced Computing An International Journal; arXiv (Cornell University); DataCite API

Article | Open access | Peer reviewed

21. CUDA programs for the GPU computing of the Swendsen–Wang multi-cluster spin flip algorithm: 2D and 3D Ising, Potts, and XY models

Yukihiro Komura, Yutaka Okabe

We present sample CUDA programs for the GPU computing of the Swendsen–Wang multi-cluster spin flip algorithm. We deal with the classical ... 14688 Distribution format: tar.gz Programming language: C, CUDA. Computer: System with an NVIDIA CUDA enabled GPU. Operating system: System with an NVIDIA CUDA enabled GPU. Classification: 23. External routines: NVIDIA CUDA Toolkit 3.0 or newer Nature of problem: ... multi-cluster spin flip Monte Carlo method. The CUDA implementation for the cluster-labeling is based on ...

Topic(s): Random Matrices and Applications

2013 - Elsevier BV | Computer Physics Communications

Sources: Computer Physics Communications; arXiv (Cornell University); DataCite API

Article

22. Accelerating kernel density estimation on the GPU using the CUDA framework

Panagiotis D. Michailidis, Konstantinos G. Margaritis

... Processing Units (GPUs) using Compute Unified Device Architecture (CUDA) programming model. In this work we discuss a naive and two optimised CUDA algorithms for the two kernel estimation methods: univariate ... also present exploratory experimental results of the proposed CUDA algorithms according to the several values of parameters ... results show significant performance gains of all proposed CUDA algorithms over serial CPU version and small performance speed-ups of the two optimised CUDA algorithms over naive GPU algorithms. Finally, based on ...

Topic(s): Advanced Data Compression Techniques

2013 - | Applied Mathematical Sciences

Article | Peer reviewed

23. COMPARISON OF PARALLEL PARTICLE SWARM OPTIMIZERS FOR GRAPHICAL PROCESSING UNITS AND MULTICORE PROCESSORS

Vincent Roberge, Mohammed Tarbouchi

... optimization (PSO) on graphical processing units (GPU) using CUDA. By fully utilizing the processing power of graphic processors, our implementation (CUDA-PSO) provides a speedup of 167× compared to ... CPU, it may be unfair to compare our CUDA implementation to a sequential one. For this reason, ... MPI-PSO) and compared its performance against our CUDA-PSO. The execution time of our CUDA-PSO remains 15.8× faster than our MPI- ... statistical significance that the results obtained using our CUDA-PSO are of equal quality as the results ...

Topic(s): Islanding Detection in Power Systems

2013 - Imperial College Press | International Journal of Computational Intelligence and Applications

Article | Peer reviewed

24. The identification of spirochetes from human cases of relapsing fever by xenodiagnosis with comments on local specificity of tick vectors

Gordon E. Davis

... multi-level parallelism on GPU clusters with MPI-CUDA and hybrid MPI-OpenMP-CUDA parallel implementations, in which all computations are done on the GPU using CUDA. We explore efficiency and scalability of incompressible flow ... merging fine-grain parallelism on the GPU using CUDA with coarse-grain parallelism that use either MPI ... large data sets, and a dual-level MPI-CUDA implementation with maximum overlapping of computation and communication ... also find that our tri-level MPI-OpenMP-CUDA parallel implementation does not offer a significant advantage ...

Topic(s): Plant Virus Research Studies

1956 - Elsevier BV | Experimental Parasitology

Sources: Experimental Parasitology; PubMed

Article | Peer reviewed

25. CUDA–MEME: Accelerating motif discovery in biological sequences using CUDA-enabled graphics processing units

Yongchao Liu, Bertil Schmidt, Weiguo Liu, Douglas L. Maskell

... to employ emerging many-core architectures such as CUDA-enabled GPUs. In this paper, we present a ... of the MEME motif discovery algorithm using the CUDA programming model. To achieve high efficiency, we introduce ... ZOOPS) motif search model. The runtime speedups of CUDA–MEME on a single GPU are also comparable ... workstation cluster. In addition to the fast speed, CUDA–MEME has the capability of finding motif instances ...

Topic(s): Fractal and DNA sequence analysis

2009 - Elsevier BV | Pattern Recognition Letters

Book chapter | Open access | Peer reviewed

26. JCUDA: A Programmer-Friendly Interface for Accelerating Java Programs with CUDA

Yonghong Yan, Max Grossman, Vivek Sarkar

... GPGPUs) to obtain order-of-magnitude performance improvements. CUDA has emerged as a popular programming model for ... and C#, it is natural to explore how CUDA-like capabilities can be made accessible to those ... can be used by Java programmers to invoke CUDA kernels. Using this interface, programmers can write Java codes that directly call CUDA kernels, and delegate the responsibility of generating the Java-CUDA bridge codes and host-device data transfer calls ...

Topic(s): Advanced Data Storage Technologies

2009 - Springer Science+Business Media | Lecture Notes in Computer Science

Article | Peer reviewed

27. hiCUDA: High-Level GPGPU Programming

Tianyi David Han, Tarek S. Abdelrahman

... GPU programmability. Although the Compute Unified Device Architecture (CUDA) is a simple C-like interface for programming NVIDIA GPUs, porting applications to CUDA remains a challenge to average programmers. In particular, CUDA places on the programmer the burden of packaging ... hiCUDA, a high-level directive-based language for CUDA programming. It allows programmers to perform these tedious ... compiler that translates a hiCUDA program to a CUDA program. Our compiler is able to support real- ... and use dynamically allocated arrays. Experiments using nine CUDA benchmarks show that the simplicity hiCUDA provides comes ...

Topic(s): Real-Time Systems Scheduling

2010 - Institute of Electrical and Electronics Engineers | IEEE Transactions on Parallel and Distributed Systems

Article | Open access | Peer reviewed

28. Accelerating Wavelet Lifting on Graphics Hardware Using CUDA

Wladimir J. van der Laan, Andrei C. Jalba, Jos B. T. M. Roerdink

... regarded as massively parallel coprocessors through NVidia's CUDA compute paradigm. The three main hardware architectures for ... based) are shown to be unsuitable for a CUDA implementation. Our CUDA-specific design can be regarded as a hybrid ... to an optimized CPU implementation and earlier non-CUDA-based GPU DWT methods, both for 2D images ... performance analysis shows that the results of our CUDA-specific design are in close agreement with our ...

Topic(s): Digital Filter Design and Implementation

2010 - Institute of Electrical and Electronics Engineers | IEEE Transactions on Parallel and Distributed Systems

Sources: IEEE Transactions on Parallel and Distributed Systems; University of Groningen research database (University of Groningen / Centre for Information Technology); Data Archiving and Networked Services (DANS)

Article | Peer reviewed

29. An object-oriented implementation of a solver of the time-dependent Schrödinger equation using the CUDA technology

Tomasz Dziubak, Jacek Matulewski

... FFT algorithm. The solution is based on NVIDIA CUDA technology. The speed-up factor in the test ... format: tar.gz Programming language: C++, C for CUDA Computer: Graphics card with CUDA technology recommended Operating system: No limits (tested on ... of processors used – one CPU core and all CUDA cores of the selected processor of graphics card ... equation. Solution method: FFT and Chebyshev polynomial algorithm, CUDA technology. Running time: Every test example included in ...

Topic(s): Spectroscopy and Quantum Chemical Studies

2011 - Elsevier BV | Computer Physics Communications

Article | Peer reviewed

30. GPU-based four-dimensional general-relativistic ray tracing

Daniel Kuchelmeister, Thomas Müller, Marco Ament, Günter Wunner, Daniel Weiskopf

... GPU using NVidia’s Compute Unified Device Architecture (CUDA), which leads to performance improvement of an order ... 1334251 Distribution format: tar.gz Programming language: C++, CUDA. Computer: Linux platforms with a NVidia CUDA enabled GPU (Compute Capability 1.3 or higher), C++ compiler, NVCC (The CUDA Compiler Driver). Operating system: Linux. RAM: 2 GB ... External routines: OpenGL Utility Toolkit development files, NVidia CUDA Toolkit 3.2, Lua5.2 Nature of problem: ... of light rays, GPU-based parallel programming using CUDA, 3D-Rendering via OpenGL. Running time: Problem dependent, ...

Topic(s): Pulsars and Gravitational Waves Research

2012 - Elsevier BV | Computer Physics Communications

1–30 of 7,050 items