BEGIN:VCALENDAR
VERSION:2.0
PRODID:Linklings LLC
BEGIN:VTIMEZONE
TZID:Europe/Stockholm
X-LIC-LOCATION:Europe/Stockholm
BEGIN:DAYLIGHT
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
TZNAME:CEST
DTSTART:19700329T020000
RRULE:FREQ=YEARLY;BYMONTH=3;BYDAY=-1SU
END:DAYLIGHT
BEGIN:STANDARD
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
TZNAME:CET
DTSTART:19701025T030000
RRULE:FREQ=YEARLY;BYMONTH=10;BYDAY=-1SU
END:STANDARD
END:VTIMEZONE
BEGIN:VEVENT
DTSTAMP:20210527T134948Z
LOCATION:Analog 1
DTSTART;TZID=Europe/Stockholm:20200623T164500
DTEND;TZID=Europe/Stockholm:20200623T165000
UID:isc_hpc_ISC High Performance 2020_sess325_post115@linklings.com
SUMMARY:Randomized SVD on TensorCores
DESCRIPTION:Research Poster\n\nRandomized SVD on TensorCores\n\nOotomo, Yo
kota\n\nLow-rank approximation is vital and widely used for data compressi
on, dimensionality reduction and noise reduction.\nRandomized SVD, which r
equires the computation of a QR factorization of a tall skinny matrix (TSQ
R), is a robust and efficient algorithm for computing a low-rank approxima
tion.\nWe implement a TSQR that runs efficiently in a parallel environment
with TensorCores and use it to compute a Randomized SVD.\nIn TSQR, the QR
factorization is done by dividing the input matrix into a column of block
s and recursively calculating the QR factorization of each resulting subma
trix.\nIn randomized SVD, the matrix for which a TSQR needs to be computed
is not skinny enough for our TensorCore TSQR implementation.\nWe thus emp
loy a BlockQR algorithm that splits the matrix into skinny enough submatri
ces and consecutively applies the TSQR to them.\nSome additional calculati
ons are necessary between the consecutive TSQR applications to retrieve th
e QR factorization of the full matrix.\nTensorCores are specialized hardwa
re for matrix multiplication and addition and are available on the latest
NVIDIA GPUs.\nConverting input matrices to half-precision on TensorCores r
esults in loss of accuracy.\nWe recover the accuracy by using an accuracy
correction technique that leverages the single-precision multiplication an
d addition of TensorCores.\nWe evaluate the speed, accuracy and stability
of the Randomized SVD on TensorCores.\nUsing TensorCores and correction te
chniques, our approach can calculate a Randomized SVD without much loss of
accuracy.\nOur approach also provides 1.5x faster performance in some cas
es and reduces working memory by 33% compared to cuSOLVER.\n\nTag: Pre-Rec
orded
URL:https://2020.isc-program.com/presentation/?id=post115&sess=sess325
END:VEVENT
END:VCALENDAR