3D GPU architecture using cache stacking: Performance, cost, power and thermal analysis

Al Maashri Ahmed, Guangyu Sun, Xiangyu Dong, Vijay Narayanan, Yuan Xie

Research output: Chapter in Book/Report/Conference proceeding (Conference contribution)

23 Citations (Scopus)

Abstract

Graphics Processing Units (GPUs) offer tremendous computational and processing power. The architecture requires high communication bandwidth and lower latency between computation units and caches. 3D die-stacking technology is a promising approach to meet such requirements. To the best of our knowledge no other study has investigated the implementation of 3D technology in GPUs. In this paper, we study the impact of stacking caches using the 3D technology on GPU performance. We also investigate the benefits of using 3D stacked MRAM on GPUs. Our work includes cost, power, and thermal analysis of the proposed architectural designs. Our results show a 53% geometric mean performance speedup for iso-cycle time architectures and about 19% for iso-cost architectures.

Original language: English
Title of host publication: 2009 IEEE International Conference on Computer Design, ICCD 2009
Pages: 254-259
Number of pages: 6
DOI: https://doi.org/10.1109/ICCD.2009.5413147
Publication status: Published - 2009
Event: 2009 IEEE International Conference on Computer Design, ICCD 2009 - Lake Tahoe, CA, United States
Duration: Oct 4 2009 to Oct 7 2009

Other

Other: 2009 IEEE International Conference on Computer Design, ICCD 2009
Country: United States
City: Lake Tahoe, CA
Period: 10/4/09 to 10/7/09

Fingerprint

Thermoanalysis
Costs
Architectural design
Bandwidth
Graphics processing unit
Communication
Processing

ASJC Scopus subject areas

  • Hardware and Architecture
  • Electrical and Electronic Engineering

Cite this

Ahmed, A. M., Sun, G., Dong, X., Narayanan, V., & Xie, Y. (2009). 3D GPU architecture using cache stacking: Performance, cost, power and thermal analysis. In 2009 IEEE International Conference on Computer Design, ICCD 2009 (pp. 254-259). [5413147] https://doi.org/10.1109/ICCD.2009.5413147

@inproceedings{75a89a6b827c49cf8745dced6476607d,
title = "3D GPU architecture using cache stacking: Performance, cost, power and thermal analysis",
abstract = "Graphics Processing Units (GPUs) offer tremendous computational and processing power. The architecture requires high communication bandwidth and lower latency between computation units and caches. 3D die-stacking technology is a promising approach to meet such requirements. To the best of our knowledge no other study has investigated the implementation of 3D technology in GPUs. In this paper, we study the impact of stacking caches using the 3D technology on GPU performance. We also investigate the benefits of using 3D stacked MRAM on GPUs. Our work includes cost, power, and thermal analysis of the proposed architectural designs. Our results show a 53{\%} geometric mean performance speedup for iso-cycle time architectures and about 19{\%} for iso-cost architectures.",
author = "Ahmed, {Al Maashri} and Guangyu Sun and Xiangyu Dong and Vijay Narayanan and Yuan Xie",
year = "2009",
doi = "10.1109/ICCD.2009.5413147",
language = "English",
isbn = "9781424450282",
pages = "254--259",
booktitle = "2009 IEEE International Conference on Computer Design, ICCD 2009",

}

TY - GEN
T1 - 3D GPU architecture using cache stacking
T2 - Performance, cost, power and thermal analysis
AU - Ahmed, Al Maashri
AU - Sun, Guangyu
AU - Dong, Xiangyu
AU - Narayanan, Vijay
AU - Xie, Yuan
PY - 2009
Y1 - 2009
N2 - Graphics Processing Units (GPUs) offer tremendous computational and processing power. The architecture requires high communication bandwidth and lower latency between computation units and caches. 3D die-stacking technology is a promising approach to meet such requirements. To the best of our knowledge no other study has investigated the implementation of 3D technology in GPUs. In this paper, we study the impact of stacking caches using the 3D technology on GPU performance. We also investigate the benefits of using 3D stacked MRAM on GPUs. Our work includes cost, power, and thermal analysis of the proposed architectural designs. Our results show a 53% geometric mean performance speedup for iso-cycle time architectures and about 19% for iso-cost architectures.

UR - http://www.scopus.com/inward/record.url?scp=77951019767&partnerID=8YFLogxK
UR - http://www.scopus.com/inward/citedby.url?scp=77951019767&partnerID=8YFLogxK
U2 - 10.1109/ICCD.2009.5413147
DO - 10.1109/ICCD.2009.5413147
M3 - Conference contribution
AN - SCOPUS:77951019767
SN - 9781424450282
SP - 254
EP - 259
BT - 2009 IEEE International Conference on Computer Design, ICCD 2009
ER -