Recent comments in /f/MachineLearning

FrogBearSalamander t1_jc5vvrb wrote

> Would love to read some research papers if you have a link!

2

kittenkrazy t1_jc5v4is wrote

Training a LoRA should be significantly cheaper, especially combined with DeepSpeed CPU offloading and training with the base model in 8-bit. You can probably get it to train on consumer cards.
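For the curious, a rough sketch of what that setup looks like with peft + bitsandbytes (the model path and LoRA hyperparameters here are placeholders, not the exact config from any particular run):

```python
# Rough sketch: a LoRA adapter on top of an 8-bit base model (peft + bitsandbytes).
# Model path and hyperparameters are placeholders.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

base = "path/to/llama-7b-hf"  # placeholder: a converted LLaMA checkpoint
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    load_in_8bit=True,   # int8 weights via bitsandbytes
    device_map="auto",
)
model = prepare_model_for_int8_training(model)  # newer peft versions call this prepare_model_for_kbit_training

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],  # attention projections are the usual LoRA targets
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the small adapter matrices are trainable
```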

And yup, completely separate unless you decide to merge them into the main model weights for faster inference / training another LoRA on top / etc.
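If you do merge, recent peft versions make it a couple of lines (a sketch, assuming the adapter was trained with peft and your version exposes merge_and_unload; merge into an fp16 copy of the base rather than the 8-bit one):

```python
# Sketch: folding a LoRA adapter back into the base weights with peft (paths are placeholders).
import torch
from transformers import AutoModelForCausalLM
from peft import PeftModel

base = AutoModelForCausalLM.from_pretrained("path/to/llama-7b-hf", torch_dtype=torch.float16)
merged = PeftModel.from_pretrained(base, "path/to/lora-adapter").merge_and_unload()
merged.save_pretrained("path/to/merged-model")  # loads as a plain model afterwards, no adapter needed
```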

Hopefully people will share LoRAs for all sorts of plug-and-play personalities and fine-tuned abilities, and it'll be like Stable Diffusion but with personal assistants.

5

generatorman_ai t1_jc5u7w2 wrote

Wow, 384 gigs for batch size 1? This is for 7B? That is an order of magnitude more than I was expecting. Sounds like even with full memory optimizations, we're far away from the 16 GB goal.

Good idea on the LoRA - since it's a completely separate set of weights, I don't see how it could come under the license. In fact, LoRAs do work on weights different from the base model they were trained on (e.g. LoRAs trained on base Stable Diffusion work when applied to heavily fine-tuned SD models), so it's not even necessarily tied to the LLaMA weights.

2

kittenkrazy t1_jc5sesx wrote

  1. Used accelerate fp16 mixed precision with DeepSpeed ZeRO stage 2 (rough sketch of that setup after this list).
  2. No xformers. No 8-bit Adam, although I did test it and it works. No gradient checkpointing on this run, but it does work.
  3. With a sequence length of 2048 I did a batch size of 1 per GPU across 8 GPUs, with gradient accumulation of 4. This was on A6000s, so 48 gigs of VRAM per card. Currently training a LoRA on the 30B with the base model in 8-bit, and I can only fit a batch of 1 at a sequence length of 350. Once this one trains I'm going to try to set up a run with the model split across the cards so I can crank up the sequence length. I'll also be training the PPO phase, so having enough VRAM will be a requirement lol.
  4. If you check out the trlx repo they have some examples, including how they trained SFT and PPO on the hh dataset. So it's basically that but with LLaMA. https://github.com/CarperAI/trlx/blob/main/examples/hh/sft_hh.py
  5. Just the hh directly. From the results it seems like it might be enough, but I might also try instruction tuning and then running the whole process from that base. I will also be running the reinforcement learning phase with a LoRA, using this as an example: https://github.com/lvwerra/trl/tree/main/examples/sentiment/scripts/gpt-neox-20b_peft
  • I'm also thinking maybe sharing LoRA weights instead of the full model is a possible way around the license issue?
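For point 1, a minimal sketch of what the accelerate + DeepSpeed ZeRO-2 fp16 setup looks like (placeholder model and data, not the actual training script):

```python
# Minimal sketch of point 1: accelerate with fp16 mixed precision + DeepSpeed ZeRO stage 2.
# The model/data below are tiny placeholders; the real run was LLaMA on 8x A6000 with accumulation of 4.
# Launch across GPUs with: accelerate launch --num_processes 8 train.py
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator, DeepSpeedPlugin

ds_plugin = DeepSpeedPlugin(zero_stage=2, gradient_accumulation_steps=4)
accelerator = Accelerator(mixed_precision="fp16", deepspeed_plugin=ds_plugin)

model = torch.nn.Linear(128, 128)                       # stand-in for the language model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
dataset = TensorDataset(torch.randn(64, 128), torch.randn(64, 128))
dataloader = DataLoader(dataset, batch_size=1)          # batch size 1 per GPU, as above

model, optimizer, dataloader = accelerator.prepare(model, optimizer, dataloader)

model.train()
for x, y in dataloader:
    loss = torch.nn.functional.mse_loss(model(x), y)
    accelerator.backward(loss)   # DeepSpeed handles fp16 loss scaling and the accumulation of 4
    optimizer.step()
    optimizer.zero_grad()
```
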
5

cthorrez t1_jc5s8ag wrote

Basically I would just make sure the metrics being compared are computed the same way: same numerator and denominator, e.g. summing vs. averaging, over the batch vs. over the epoch. If the datasets are the same and the type of metric you're computing is the same, it's comparable.

The implementation details just become part of the comparison.
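To make the summing vs. averaging point concrete, here's a toy example (hypothetical numbers) where the mean of per-batch means and the true epoch mean disagree because the last batch is smaller:

```python
# Hypothetical example: the same per-example losses, aggregated two different ways.
batches = [[0.2, 0.4, 0.6, 0.8], [1.0]]  # last batch is smaller (end of epoch)

# Mean of per-batch means: every batch counts equally, regardless of size.
mean_of_batch_means = sum(sum(b) / len(b) for b in batches) / len(batches)

# Epoch mean: sum everything, divide by the total number of examples.
epoch_mean = sum(sum(b) for b in batches) / sum(len(b) for b in batches)

print(mean_of_batch_means)  # 0.75
print(epoch_mean)           # 0.6
```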

9

blueSGL t1_jc5s56i wrote

Less than $100 to get this sort of performance out of a 7B parameter model, and per the LLaMA paper they stopped training the 7B and 13B parameter models early.

The question now is just how much better small models can get (a lawyer/doctor/therapist in everyone's pocket, completely private?).

15

extopico t1_jc5revh wrote

Flan-T5 is good, and flan-t5-xl runs well on a 3060 in 8-bit mode. It's not meant to be a chatbot, however, which is why it doesn't stir up as much excitement. T5 is best used for tasks and for training to handle specific domains. That makes it far more interesting to me than LLaMA, which cannot be trained (yet) by us randoms.
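For reference, running flan-t5-xl in 8-bit is roughly this (a sketch, assuming transformers, accelerate and bitsandbytes are installed; the prompt is just an example):

```python
# Rough sketch of running flan-t5-xl in 8-bit on a single consumer GPU.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xl")
model = AutoModelForSeq2SeqLM.from_pretrained(
    "google/flan-t5-xl",
    load_in_8bit=True,   # int8 weights via bitsandbytes; small enough for a 12 GB card like the 3060
    device_map="auto",
)

inputs = tokenizer("Summarize: the quick brown fox jumps over the lazy dog.", return_tensors="pt").to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```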

4

generatorman_ai t1_jc5q5z0 wrote

That's great; it's been hard to find people who are actually fine-tuning LLaMA. Would you mind sharing your experience for the benefit of the open-source community?

  1. Did you train the full-precision weights?
  2. Did you use memory optimizations like xformers, 8-bit Adam (from bitsandbytes), gradient checkpointing etc.?
  3. How much VRAM does it take for a batch size of 1?
  4. hh seems to be a preference dataset for RLHF rather than a text corpus - how did you use it as a fine-tuning dataset?
  5. Did you first do instruction fine-tuning (using something like FLAN or Self-Instruct) or just the hh directly?
6

Disastrous_Elk_6375 t1_jc5pny8 wrote

> TBH, I'm pretty impressed for a 7B parameter model.

Same here. I've tried a bunch of prompts from a repo and the "follow the instruction" part seems pretty good and consistent. The overall quality of the output is of course subpar compared with ChatGPT, but considering that we're talking about 7B vs 175B, this is pretty good!

10

Disastrous_Elk_6375 t1_jc5pe77 wrote

Some more:

P: I want you to act as a prompt generator. Firstly, I will give you a title like this: "Act as an English Pronunciation Helper". Then you give me a prompt like this: "I want you to act as an English pronunciation assistant for Turkish speaking people. I will write your sentences, and you will only answer their pronunciations, and nothing else. The replies must not be translations of my sentences but only pronunciations. Pronunciations should use Turkish Latin letters for phonetics. Do not write explanations on replies. My first sentence is "how the weather is in Istanbul?"." (You should adapt the sample prompt according to the title I gave. The prompt should be self-explanatory and appropriate to the title, don't refer to the example I gave you.). My first title is "Act as a Code Review Helper" (Give me prompt only)

A: I want you to act as a Code Review Helper.

Prime: I want you to review the code written by another person. Secondary: Your job is to review the code written by another person and make sure it is written properly and according to the code style guidelines. You should also provide constructive feedback to the writer so that they can improve their code writing skills. Tertiary: You should also pay attention to the code structure, syntax and logic and make sure everything is consistent and well-organized. Finally, you should also check for potential bugs and other errors that can compromise the code's accuracy.


P: I want you to act as a text based adventure game. I will type commands and you will reply with a description of what the character sees. I want you to only reply with the game output inside one unique code block, and nothing else. do not write explanations. do not type commands unless I instruct you to do so. when i need to tell you something in english, i will do so by putting text inside curly brackets {like this}. my first command is wake up

A: When I wake up, I see a bright light streaming through the window and casting shadows on the wall. I can smell the fresh morning air and hear the birds chirping outside.


P: I want you to act as a self-help book. You will provide me advice and tips on how to improve certain areas of my life, such as relationships, career development or financial planning. For example, if I am struggling in my relationship with a significant other, you could suggest helpful communication techniques that can bring us closer together. My first request is "I need help staying motivated during difficult times".

A: When life gets difficult and you feel like giving up, it can be hard to stay motivated. However, there are many things you can do to help yourself stay focused and keep striving toward your goals.

First, it is important to recognize the source of your motivation. Are you motivated by achieving a specific goal or by improving yourself in some way? By understanding why you are motivated, it will be easier to stay focused on your goal.

Next, set smaller, achievable goals. Break your larger goals into smaller, manageable tasks. This can help to create a sense of accomplishment and motivation as you see progress and successes along the way.

Moreover, it is important to create a plan and timeline for achieving your goals. Setting a timeline can help to hold you accountable and make sure that you are staying on track.

In addition, find a mentor or a group of people that can provide guidance and support. Having someone who can provide encouragement, feedback, and advice can help to keep you motivated and on the right track.

Finally, remember that life is unpredictable. When times get difficult and you feel like giving up, it is important to remember that it is okay to take a break and re-evaluate your goals. Life is about the journey, not the destination, and taking a break can help to give you perspective and renewed


This is not bad at all!

4