You can now convert any LLM into a faster one without retraining from scratch. NVIDIA just did this to their 30B model. Here's the trick:
1. Duplicate the model into two copies
2. Freeze one copy; it just reads the prompt and remembers context
3. Train the other copy to write chunks of text at once instead of one word at a time
4. Run them together
The frozen copy barely costs anything (it's already trained). The new copy only needed ~8% of the original training data to learn the new trick.
Result: 2.4x fa
103.4K views · 1 day ago
x.com · Lior Alexander
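The four steps in the post can be sketched with a toy example. This is only an illustration under stated assumptions, not NVIDIA's implementation: `ToyModel`, `split_into_reader_and_writer`, `decode_steps`, and the chunk size of 4 are all hypothetical names and values chosen for the sketch, and the step counts it prints are not the source of the post's speedup figure.

```python
import copy
import math


class ToyModel:
    """Hypothetical stand-in for an LLM: named weights plus a trainable flag."""

    def __init__(self, weights):
        self.weights = dict(weights)
        self.trainable = True

    def sgd_step(self, grads, lr=0.5):
        # A frozen model ignores every update.
        if not self.trainable:
            return
        for name, grad in grads.items():
            self.weights[name] -= lr * grad


def split_into_reader_and_writer(base):
    # Step 1: duplicate the model into two copies.
    reader = copy.deepcopy(base)
    writer = copy.deepcopy(base)
    # Step 2: freeze the reader; it only reads the prompt.
    reader.trainable = False
    return reader, writer


def decode_steps(num_tokens, chunk_size):
    # Writer calls needed if each call emits chunk_size tokens
    # (chunk_size=1 corresponds to one token at a time).
    return math.ceil(num_tokens / chunk_size)


base = ToyModel({"w": 1.0})
reader, writer = split_into_reader_and_writer(base)

# Step 3: only the writer copy changes during training.
grads = {"w": 1.0}
reader.sgd_step(grads)
writer.sgd_step(grads)
print(reader.weights["w"], writer.weights["w"])

# Step 4: compare call counts for a 256-token output (chunk size assumed).
print(decode_steps(256, 1), decode_steps(256, 4))
```

The step count only shows why emitting several tokens per call reduces the number of sequential calls; real speedups depend on per-call cost, which this sketch does not model.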