Alireza Fathi

Senior Staff Research Scientist / Manager at Google DeepMind

Computer Vision, Machine Learning, Neural Rendering, Generative Models, Robotics, Research Leadership

About

I'm Alireza Fathi, a Senior Staff Research Scientist and Manager at Google DeepMind. My career has been dedicated to pushing the boundaries of Computer Vision and Machine Learning, with a journey that spans from my PhD at Georgia Tech to research roles at Apple and Stanford. Currently, I lead teams focused on 3D scene understanding, generative models, and multimodal LLMs, contributing to products like Waymo, Google Maps, and Lens. I am deeply passionate about collaborative research and building technologies that bridge the gap between vision and language. Beyond the lab, I'm a supporter of Manchester United and an advocate for the research community. I'm always looking to connect with talented researchers and interns who are eager to achieve SOTA results and solve complex spatial AI challenges.

Networking

What I can offer

  • Deep expertise in 3D vision and multimodal AI
  • Insights into Google DeepMind research culture
  • Mentorship and internship opportunities
  • Professional support and resources

Looking for

  • Research Scientists and Interns with expertise in Multimodal AI
  • Collaborators in Video Understanding
  • Expanding my professional network
  • Exploring mutual opportunities in Artificial Intelligence

Best fit for

AI Researchers, Computer Vision PhD Students, Machine Learning Engineers, Tech Industry Professionals

Current Interests

Multimodal LLMs, 3D Scene Synthesis, Autonomous Systems, Object-centric Neural Rendering, Manchester United

Background

Career

Progressed from PhD research and internships at Microsoft and Disney to a Research Engineer role at Apple, followed by a postdoctoral position at Stanford, eventually leading to a long-term research leadership tenure at Google DeepMind.

Education

Ph.D. in Computer Science and M.Sc. in Machine Learning from Georgia Institute of Technology; M.Sc. in Computer Science from Simon Fraser University; B.Eng. in Computer Software Engineering from Sharif University of Technology.

Achievements

  • Achieved state-of-the-art (SOTA) results in 3D instance segmentation on ScanNet and S3DIS
  • Released TensorFlow 3D framework
  • Developed AVIS, an LLM agent for visual information seeking
  • Contributed research to Waymo, Google Maps, and Google Lens
  • Developed real-time surface normal prediction for mobile hardware

Opinions

  • High value placed on a track record of strong papers at top ML and CV conferences
  • Strong advocate for collaborative research and team-based success
  • Commitment to community support during industry layoffs