Volume 9, Number 1, March 2012
Walid J. Ghandour, Haitham Akkary, Wes Masri: Leveraging Strength-Based Dynamic Information Flow Analysis to Enhance Data Value Prediction. 1
Bita Mazloom, Shashidhar Mysore, Mohit Tiwari, Banit Agrawal, Timothy Sherwood: Dataflow Tomography: Information Flow Tracking For Understanding and Visualizing Full Systems. 3
Jung Ho Ahn, Norman P. Jouppi, Christos Kozyrakis, Jacob Leverich, Robert S. Schreiber: Improving System Energy Efficiency with Memory Rank Subsetting. 4
Xuejun Yang, Li Wang, Jingling Xue, Qingbo Wu: Comparability Graph Coloring for Optimizing Utilization of Software-Managed Stream Register Files for Stream Processors. 5
Abhinandan Majumdar, Srihari Cadambi, Michela Becchi, Srimat T. Chakradhar, Hans Peter Graf: A Massively Parallel, Energy Efficient Programmable Accelerator for Learning and Classification. 6
Volume 9, Number 2, June 2012
Stijn Eyerman, Lieven Eeckhout: Probabilistic modeling for job symbiosis scheduling on SMT processors. 7
Rachid Seghir, Vincent Loechner, Benoît Meister: Integer affine transformations of parametric ℤ-polytopes and applications to loop nest optimization. 8
Yi Yang, Ping Xiang, Jingfei Kong, Mike Mantor, Huiyang Zhou: A unified optimizing compiler framework for different GPGPU architectures. 9
Choonki Jang, Jaejin Lee, Bernhard Egger, Soojung Ryu: Automatic code overlay generation and partially redundant code fetch elimination. 10
Zahra Abbasi, Georgios Varsamopoulos, Sandeep K. S. Gupta: TACOMA: Server and workload management in internet data centers considering cooling-computing power trade-off and energy proportionality. 11
Andreas Lankes, Thomas Wild, Stefan Wallentowitz, Andreas Herkersdorf: Benefits of selective packet discard in networks-on-chip. 12
Volume 9, Number 3, September 2012
Yangchun Luo, Antonia Zhai: Dynamically dispatching speculative threads to improve sequential execution. 13
Huimin Cui, Jingling Xue, Lei Wang, Yang Yang, Xiaobing Feng, Dongrui Fan: Extendable pattern-oriented optimization directives. 14
Adam Wade Lewis, Nian-Feng Tzeng, Soumik Ghosh: Runtime energy consumption estimation for server workloads based on chaotic time-series approximation. 15
Alejandro Valero, Julio Sahuquillo, Salvador Petit, Pedro López, José Duato: Combining recency of information with selective random and a victim cache in last-level caches. 16
Polychronis Xekalakis, Nikolas Ioannou, Marcelo Cintra: Mixed speculative multithreaded execution models. 18
Diego Andrade, Basilio B. Fraguela, Ramon Doallo: Static analysis of the worst-case memory performance for irregular codes with indirections. 20
Yang Chen, Shuangde Fang, Yuanjie Huang, Lieven Eeckhout, Grigori Fursin, Olivier Temam, Chengyong Wu: Deconstructing iterative optimization. 21
Apala Guha, Kim M. Hazelwood, Mary Lou Soffa: Memory optimization of dynamic binary translators for embedded systems. 22
Volume 9, Number 4, January 2013
Jeremy Fowers, Greg Brown, John Robert Wernsing, Greg Stitt: A performance and energy comparison of convolution on GPUs, FPGAs, and multicore processors. 25
Erven Rohou, Kevin Williams, David Yuste: Vectorization technology to improve interpreter performance. 26
Yong Li, Rami G. Melhem, Alex K. Jones: PS-TLB: Leveraging page classification information for fast, scalable and efficient translation for future CMPs. 28
Kristof Du Bois, Stijn Eyerman, Lieven Eeckhout: Per-thread cycle accounting in multicore processors. 29
Christian Wimmer, Michael Haupt, Michael L. Van de Vanter, Mick J. Jordan, Laurent Daynès, Doug Simon: Maxine: An approachable virtual machine for, and in, Java. 30
Malik Murtaza Khan, Protonu Basu, Gabe Rudy, Mary W. Hall, Chun Chen, Jacqueline Chame: A script-based autotuning compiler system to generate high-performance CUDA code. 31
Kenzo Van Craeynest, Lieven Eeckhout: Understanding fundamental design choices in single-ISA heterogeneous multicore architectures. 32
Samuel Antao, Leonel Sousa: The CRNS framework and its application to programmable and reconfigurable cryptography. 33
Boubacar Diouf, Can Hantas, Albert Cohen, Özcan Özturk, Jens Palsberg: A decoupled local memory allocator. 34
Huimin Cui, Qing Yi, Jingling Xue, Xiaobing Feng: Layout-oblivious compiler optimization for matrix computations. 35
Stephen Dolan, Servesh Muralidharan, David Gregg: Compiler support for lightweight context switching. 36
Pablo Abad, Valentin Puente, José-Ángel Gregorio: LIGERO: A light but efficient router conceived for cache-coherent chip multiprocessors. 37
Jorge Albericio, Pablo Ibáñez, Víctor Viñals, José María Llabería: Exploiting reuse locality on inclusive shared last-level caches. 38
Paraskevas Yiapanis, Demian Rosas-Ham, Gavin Brown, Mikel Luján: Optimizing software runtime systems for speculative parallelization. 39
Cedric Nugteren, Pieter Custers, Henk Corporaal: Algorithmic species: A classification of affine loop nests for parallel programming. 40
Zhichao Yan, Hong Jiang, Yujuan Tan, Dan Feng: An integrated pseudo-associativity and relaxed-order approach to hardware transactional memory. 42
Doris Chen, Deshanand P. Singh: Profile-guided floating- to fixed-point conversion for hybrid FPGA-processor applications. 43
Yan Cui, Yingxin Wang, Yu Chen, Yuanchun Shi: Lock-contention-aware scheduler: A scalable and energy-efficient method for addressing scalability collapse on multicore systems. 44
Kishore Kumar Pusukuri, Rajiv Gupta, Laxmi N. Bhuyan: ADAPT: A framework for coscheduling multithreaded programs. 45
Grigorios Chrysos, Panagiotis Dagritzikos, Ioannis Papaefstathiou, Apostolos Dollas: HC-CART: A parallel system implementation of data mining classification and regression tree (CART) algorithm on a multi-FPGA system. 47
Jongwon Lee, Yohan Ko, Kyoungwoo Lee, Jonghee M. Youn, Yunheung Paek: Dynamic code duplication with vulnerability awareness for soft error detection on VLIW architectures. 48
Carlos Luque, Miquel Moretó, Francisco J. Cazorla, Mateo Valero: Fair CPU time accounting in CMP+SMT processors. 50
Pavlos M. Mattheakis, Ioannis Papaefstathiou: Significantly reducing MPI intercommunication latency and power overhead in both embedded and HPC systems. 51
Riyadh Baghdadi, Albert Cohen, Sven Verdoolaege, Konrad Trifunovic: Improved loop tiling based on the removal of spurious false dependences. 52
Antoniu Pop, Albert Cohen: OpenStream: Expressiveness and data-flow compilation of OpenMP streaming programs. 53
Sven Verdoolaege, Juan Carlos Juega, Albert Cohen, José Ignacio Gómez, Christian Tenllado, Francky Catthoor: Polyhedral parallel code generation for CUDA. 54
Yu Du, Miao Zhou, Bruce R. Childers, Rami G. Melhem, Daniel Mossé: Delta-compressed caching for overcoming the write bandwidth limitation of hybrid main memory. 55
Mehmet E. Belviranli, Laxmi N. Bhuyan, Rajiv Gupta: A dynamic self-scheduling scheme for heterogeneous multiprocessor architectures. 57
Thibaut Lutz, Christian Fensch, Murray Cole: PARTANS: An autotuning framework for stencil computation on multi-GPU systems. 59
Chunhua Xiao, M.-C. Frank Chang, Jason Cong, Michael Gill, Zhangqin Huang, Chunyue Liu, Glenn Reinman, Hao Wu: Stream arbitration: Towards efficient bandwidth utilization for emerging on-chip interconnects. 60