Materials

Learning materials from Xiaohan


  1. LLM Hands-on Tutorial
  2. Deploy your NLP Model as an Interactive Web Application
  3. Fine Tuning BERT for Question Answering (QA) Task

Learning materials online

Listed newest first, by the date each item was released or last updated.


  1. (NLP) Infographics using Large Language Models (08-28-2023)
    • LIDA is a library for generating data visualizations and data-faithful infographics. LIDA is grammar agnostic (it works with any programming language and visualization library, e.g. matplotlib, seaborn, altair, d3) and works with multiple large language model providers (OpenAI, PaLM, Cohere, Huggingface).
    • https://github.com/microsoft/lida
  2. (NLP) Graph of Thoughts (08-24-2023)
  3. (NLP) Code Llama (08-24-2023)
    • Code Llama is a family of large language models for code based on Llama 2 providing state-of-the-art performance among open models, infilling capabilities, support for large input contexts, and zero-shot instruction following ability for programming tasks.
    • https://github.com/facebookresearch/codellama
  4. (audio) Seamless M4T
  5. (NLP) LlamaGPT (08-21-2023)
  6. (multimodal) CoDeF: Content Deformation Fields for Temporally Consistent Video Processing (08-20-2023)
  7. (AI simulation) Generative Agents: Interactive Simulacra of Human Behavior (08-13-2023)
  8. (NLP) gpt-llm-trainer (08-07-2023)
    • This project aims to explore an experimental new pipeline to train a high-performing task-specific model. We try to abstract away all the complexity, so it’s as easy as possible to go from idea -> performant fully-trained model.
    • Simply input a description of your task, and the system will generate a dataset from scratch, parse it into the correct format, and fine-tune a LLaMA 2 model for you.
    • https://github.com/mshumer/gpt-llm-trainer
  9. (multimodal) Awesome-Multimodal-Large-Language-Models
  10. (audio) Whisper
    • OpenAI trained and open-sourced Whisper, a neural network that approaches human-level robustness and accuracy on English speech recognition.
    • https://openai.com/research/whisper
  11. (NLP) Doctor GPT (08-06-2023)
    • DoctorGPT is a Large Language Model that can pass the US Medical Licensing Exam.
    • link: github repo

Practical/Interesting Information