Zack Ankner

I am a third-year undergraduate at MIT studying Computer Science and Mathematics. I am currently a member of Jonathan Ragan-Kelley's lab, where I work with William Brandon. Previously, I was a member of the Programming Systems Group (PSG) led by Michael Carbin, where I was supervised by Alex Renda. I have also been working as a research scientist intern at MosaicML since the summer of 2022.

My research aims to improve LLMs through rigorous empirical investigation of simple modeling changes. I believe there are many uncomplicated research ideas that yield consistent performance gains but are overlooked in favor of more complex and intricate solutions, which are often brittle and do not scale. In pretraining, I have investigated LLM-based data filtering techniques as well as masking rate schedules for masked language modeling, both aimed at improving trained model quality. For inference, I have proposed adding sequential dependence to draft heads to improve speculative decoding performance. My current projects involve improving LLM performance using dynamic inference compute and training improved preference models. Outside of my current focus, I have also worked on generative 3D modeling, neural surrogates of symbolic programs, and the science of ML.

You can find my resume here.

Papers (* denotes equal contribution)

Scaling Laws For Precision

Tanishq Kumar*, Zachary Ankner*, Benjamin F. Spector, Blake Bordelon, Niklas Muennighoff, Mansheej Paul, Cengiz Pehlevan, Christopher Ré, and Aditi Raghunathan

Preprint

Critique-out-Loud Reward Models

Zachary Ankner*, Mansheej Paul*, Brandon Cui, Jonathan D. Chang, and Prithviraj Ammanabrolu

Preprint

Code

Vid3D: Synthesis of Dynamic 3D Scenes using 2D Video Diffusion

Rishab Parthasarathy*, Zachary Ankner*, and Aaron Gokaslan

ICML 2024, CVG Workshop, Poster

Perplexed by Perplexity: Perplexity-Based Data Pruning With Small Reference Models

Zachary Ankner*, Cody Blakeney, Kartik Sreenivasan, Max Marion, Matthew L. Leavitt, and Mansheej Paul

Preprint

Hydra: Sequentially-Dependent Draft Heads for Medusa Decoding

Zachary Ankner*, Rishab Parthasarathy*, Aniruddha Nrusimha, Christopher Rinard, Jonathan Ragan-Kelley, and William Brandon

To appear at COLM 2024

Code

Striped Attention: Faster Ring Attention for Causal Transformers

William Brandon, Aniruddha Nrusimha, Kevin Qian, Zachary Ankner, Tian Jin, Zhiye Song, and Jonathan Ragan-Kelley

Preprint

Dynamic Masking Rate Schedules for MLM Pretraining

Zachary Ankner*, Naomi Saphra, Davis Blalock, Jonathan Frankle, and Matthew L. Leavitt

EACL 2024, Poster

3D Neural Field Generation using Triplane Diffusion

J. Ryan Shue*, Eric Ryan Chan*, Ryan Po*, Zachary Ankner*, Jiajun Wu, and Gordon Wetzstein

CVPR 2023, Poster

Project page, Code

The Effect of Data Dimensionality on Neural Network Prunability

Zachary Ankner*, Alex Renda, Gintare Karolina Dziugaite, Jonathan Frankle, and Tian Jin

NeurIPS 2022, ICBINB Workshop

EntailSum: An Entailment-Based Approach to Aspect-Based Text Summarization with Automated Aspect Adaptation

Zachary Ankner*, Purvaja Balaji, Ye Zhu, Chun Keat Hiew, Patrick Wang, and Amar Gupta

International Journal of Pattern Recognition and Artificial Intelligence