This paper is available on arxiv under the CC BY-NC-ND 4.0 DEED license.

Authors: Minghao Yan, University of Wisconsin-Madison; Hongyi Wang, Carnegie Mellon University; Shivaram Venkataraman, myan@cs.wisc.edu.

Table of Links
Abstract & Introduction
Motivation
Opportunities
Architecture Overview
Problem Formulation: Two-Phase Tuning
Modeling Workload Interference
Experiments
Conclusion & References
A. Hardware Details
B. Experimental Results
C. Arithmetic Intensity
D.
This indicates that the default memory frequency is higher than optimal for modern deep learning workloads. For heavy workloads such as BERT, memory tuning can account for the majority of the reduction in energy consumption. This can be partially attributed to the memory-bound nature of Transformer-based models. Our results demonstrate that systems aiming to optimize energy use in neural network inference need to take memory frequency into account.
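The memory-bound claim can be illustrated with a simple roofline-style check: a workload whose arithmetic intensity (FLOPs per byte of memory traffic) falls below the hardware's ridge point (peak FLOP/s divided by memory bandwidth) is limited by memory bandwidth, which is why lowering memory frequency shifts the energy/performance trade-off. The sketch below uses illustrative, assumed hardware numbers (roughly A100-class) and a hypothetical per-layer FLOP/byte count, not measurements from the paper.

```python
def arithmetic_intensity(flops: float, bytes_moved: float) -> float:
    """FLOPs performed per byte of memory traffic."""
    return flops / bytes_moved

def is_memory_bound(intensity: float, peak_flops: float, mem_bandwidth: float) -> bool:
    """Roofline ridge point: below peak_flops / mem_bandwidth,
    attainable performance is bounded by memory bandwidth."""
    return intensity < peak_flops / mem_bandwidth

# Assumed (illustrative) hardware figures, not taken from the paper:
peak = 312e12   # ~312 TFLOP/s FP16 peak
bw = 1.555e12   # ~1.555 TB/s memory bandwidth
# Ridge point is peak / bw, roughly 200 FLOP/byte here.

# Hypothetical Transformer layer at small batch size:
# 2 GFLOPs against 40 MB of traffic -> 50 FLOP/byte.
ai = arithmetic_intensity(flops=2e9, bytes_moved=4e7)
print(ai)                            # 50.0
print(is_memory_bound(ai, peak, bw)) # True: bandwidth-limited
```

Under these assumed numbers the layer sits well below the ridge point, so throughput tracks memory bandwidth rather than compute; this is the regime where memory-frequency tuning has the most leverage.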