Recent comments in /f/MachineLearning
generatorman_ai t1_jc5vsbw wrote
Reply to comment by extopico in [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
T5 is below the zero-shot phase transition crossed by GPT-3 175B (and presumably by LLaMA 7B). Modern models with instruction tuning and human-feedback (RLHF) finetuning will not need further task-specific finetuning for most purposes.
oathbreakerkeeper t1_jc5viv0 wrote
Reply to comment by farmingvillein in [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
Who is Emad? And what is SD?
generatorman_ai t1_jc5vc5r wrote
Reply to comment by generatorman_ai in [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
Probably I'm misinterpreting: you mean you used a batch size of 1 per GPU with 8 GPUs, so actually it's 48 GB with no optimizations (except fp16). That sounds more reasonable, though it's probably still several gigabytes too large for 16 GB even with common optimizations.
oathbreakerkeeper t1_jc5vbgx wrote
Reply to comment by farmingvillein in [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
How was OpenAI's API used to bootstrap Alpaca?
kittenkrazy t1_jc5v4is wrote
Reply to comment by generatorman_ai in [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
Training a LoRA should be significantly cheaper, especially combined with DeepSpeed CPU offloading and training with the model in 8-bit. You can probably get it to train on consumer cards.
And yup, completely separate unless you decide to merge them with the main model weights for faster inference/training another LoRA on top/etc.
Hopefully people will share LoRAs for all sorts of plug-and-play personalities and finetuned abilities, and it'll be like Stable Diffusion but with personal assistants.
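Something like this, roughly (untested sketch with peft + bitsandbytes; the hub id, rank, and target modules are just assumptions):

```python
# Minimal sketch: LoRA on top of an 8-bit base model with HF peft + bitsandbytes.
# The hub id, rank, and target modules below are illustrative assumptions.
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model, prepare_model_for_int8_training

base = "decapoda-research/llama-7b-hf"  # hypothetical hub id
tokenizer = AutoTokenizer.from_pretrained(base)
model = AutoModelForCausalLM.from_pretrained(
    base,
    load_in_8bit=True,   # bitsandbytes int8 weights
    device_map="auto",   # spread layers across available GPUs
)
model = prepare_model_for_int8_training(model)  # casts norms/head for stable training

lora_config = LoraConfig(
    r=8,                                  # low-rank dimension
    lora_alpha=16,                        # scaling factor
    target_modules=["q_proj", "v_proj"],  # attention projections, typical for LLaMA
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # only the adapters train, a tiny fraction of 7B
```

Only the adapter weights get gradients and optimizer states; the 8-bit base stays frozen, which is what makes consumer cards plausible.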
sneakpeekbot t1_jc5upgp wrote
Reply to comment by pdaddyo in [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
Here's a sneak peek of /r/Oobabooga using the top posts of all time!
#1: The new streaming algorithm has been merged. It's a lot faster! | 6 comments
#2: Text streaming will become 1000000x faster tomorrow
#3: LLaMA tutorial (including 4-bit mode) | 10 comments
pdaddyo t1_jc5uoly wrote
Reply to comment by disgruntled_pie in [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
And if you get stuck check out /r/oobabooga
generatorman_ai t1_jc5u7w2 wrote
Reply to comment by kittenkrazy in [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
Wow, 392 gigs for batch size 1? This is for 7B? That is an order of magnitude more than I was expecting. Sounds like even with full memory optimizations, we're far away from the 16 GB goal.
Good idea on the LoRA - since it's a completely separate set of weights I don't see how it could come under the license. In fact LoRAs do work on weights different from the base model they were trained from (e.g. LoRAs trained on base Stable Diffusion work when applied to heavily fine-tuned SD models), so it's not even necessarily tied to the LLaMA weights.
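For intuition, the adapter really is just a separate low-rank pair of matrices added onto the frozen weight; a toy numpy sketch (shapes are made up):

```python
# Toy numpy sketch of the LoRA decomposition: the adapter (A, B) lives apart
# from the frozen base weight W and is only (optionally) merged for inference.
import numpy as np

d, k, r = 512, 512, 8              # illustrative shapes; r is the LoRA rank
alpha = 16                         # LoRA scaling factor
W = np.random.randn(d, k)          # frozen base weight (never touched by training)
A = np.random.randn(r, k) * 0.01   # trained adapter factor
B = np.zeros((d, r))               # B starts at zero, so the adapter is a no-op initially

delta = (alpha / r) * (B @ A)      # low-rank update: the only thing you'd share
W_merged = W + delta               # optional merge for faster inference
```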
rpnewc t1_jc5u6xd wrote
Reply to [R] Training Small Diffusion Model by crappr
Check out lucidrains' great GitHub repo. It works beautifully.
visarga t1_jc5teq6 wrote
Reply to comment by LessPoliticalAccount in [D] Are modern generative AI models on a path to significantly improved truthfulness? by buggaby
Then we only need to use a second model for strict fact checking, not creative responses. Since entailment is a common NLP task, I am sure any LLM can solve it out of the box, though of course with its own error rate.
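Something like this with an off-the-shelf NLI model (sketch; the model choice and the 0.9 threshold are assumptions):

```python
# Sketch: checking a generated claim against a source passage with an
# off-the-shelf NLI model. Model choice and threshold are assumptions.
from transformers import pipeline

nli = pipeline("text-classification", model="roberta-large-mnli")

premise = "The Eiffel Tower was completed in 1889 for the World's Fair."
claim = "The Eiffel Tower was finished in 1889."

# Pass premise/hypothesis as a sentence pair; labels for this model are
# ENTAILMENT / NEUTRAL / CONTRADICTION.
result = nli([{"text": premise, "text_pair": claim}])[0]
print(result)  # e.g. {'label': 'ENTAILMENT', 'score': ...}

is_supported = result["label"] == "ENTAILMENT" and result["score"] > 0.9
```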
kittenkrazy t1_jc5sesx wrote
Reply to comment by generatorman_ai in [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
- Used accelerate fp16 mixed precision with DeepSpeed ZeRO stage 2 (see the sketch after this list).
- No xformers, no 8-bit Adam (although I did test it and it works), and no gradient checkpointing on this run, but it does work.
- With a sequence length of 2048 I used a batch size of 1 across 8 GPUs with gradient accumulation of 4. This was on A6000s, so 48 GB of VRAM per card. Currently training a LoRA on the 30B with the base model in 8-bit, and I can only fit a batch size of 1 at a sequence length of 350. Once this one trains I'm going to try setting up a run with the model split across the cards so I can crank up the sequence length. I will also be training the PPO phase, so having enough VRAM will be a requirement there lol.
- If you check out the trlx repo they have some examples, including how they trained SFT and PPO on the hh dataset. So it's basically that but with LLaMA. https://github.com/CarperAI/trlx/blob/main/examples/hh/sft_hh.py
- Just the hh directly. From the results it seems like it might be enough, but I might also try instruction tuning and then running the whole process from that base. I will also be running the reinforcement learning with a LoRA, using this as an example: https://github.com/lvwerra/trl/tree/main/examples/sentiment/scripts/gpt-neox-20b_peft
- I'm also thinking maybe sharing LoRA weights instead of the full model is a possible way around the license issue?
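For reference, the accelerate side of that setup looks roughly like this (minimal sketch with a stand-in model and data; DeepSpeed ZeRO-2 itself is enabled through the `accelerate config` file, not in code):

```python
# Minimal sketch of fp16 mixed precision with gradient accumulation of 4 via
# HF accelerate. A tiny stand-in model/dataset replaces LLaMA and the hh data.
import torch
from torch.utils.data import DataLoader, TensorDataset
from accelerate import Accelerator

accelerator = Accelerator(mixed_precision="fp16", gradient_accumulation_steps=4)

model = torch.nn.Linear(128, 128)  # stand-in for the 7B model
optimizer = torch.optim.AdamW(model.parameters(), lr=1e-5)
data = TensorDataset(torch.randn(64, 128), torch.randn(64, 128))
loader = DataLoader(data, batch_size=1)  # batch size 1 per device, as above

model, optimizer, loader = accelerator.prepare(model, optimizer, loader)

for x, y in loader:
    with accelerator.accumulate(model):  # optimizer steps every 4 micro-batches
        loss = torch.nn.functional.mse_loss(model(x), y)
        accelerator.backward(loss)       # handles fp16 gradient scaling
        optimizer.step()
        optimizer.zero_grad()
```

With batch size 1 per device, 8 GPUs, and accumulation of 4, the effective batch size comes out to 32.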
cthorrez t1_jc5s8ag wrote
Basically I would just make sure the metrics being compared are computed the same way: same numerator and denominator (e.g. summing vs. averaging, over the batch vs. the epoch). If the datasets are the same and the type of metric you are computing is the same, it's comparable.
The implementation details just become part of the comparison.
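A toy illustration of the pitfall: with uneven batches, the mean of per-batch means is not the global mean, so two runs "averaging the loss" can still report different numbers.

```python
# Toy example: mean-of-batch-means != global mean when batches are uneven.
losses = [1.0, 1.0, 1.0, 5.0]          # four examples' losses
batches = [[1.0, 1.0, 1.0], [5.0]]     # the same losses, split unevenly

global_mean = sum(losses) / len(losses)              # 2.0
batch_means = [sum(b) / len(b) for b in batches]     # [1.0, 5.0]
mean_of_means = sum(batch_means) / len(batch_means)  # 3.0

print(global_mean, mean_of_means)  # same data, different "average loss"
```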
mmmfritz t1_jc5s85o wrote
Reply to [D] Simple Questions Thread by AutoModerator
Fact checking. Any open source models or people working on fact checking?
Spiritual-Reply5896 t1_jc5s7ew wrote
Reply to [D] ChatGPT without text limits. by spiritus_dei
How is the similarity between synonyms or semantically similar sentences ensured if regex is used for retrieving the input prompts? Maybe I missed something as I skimmed over the paper, but that was the impression I got.
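For contrast, a regex only matches surface forms, while an embedding model scores paraphrases as neighbors; a quick sketch (the sentence-transformers model name is just an assumption):

```python
# Sketch of the contrast: regex matches surface strings only, while an
# embedding model scores paraphrases as similar. Model name is an assumption.
import re
from sentence_transformers import SentenceTransformer, util

query = "How do I make my car faster?"
stored = "Ways to improve vehicle speed"

print(bool(re.search(r"car faster", stored)))  # False: no surface overlap

model = SentenceTransformer("all-MiniLM-L6-v2")
q, s = model.encode([query, stored], convert_to_tensor=True)
print(util.cos_sim(q, s))  # high cosine similarity despite zero shared keywords
```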
mmmfritz t1_jc5s5r1 wrote
Reply to comment by Anthony-Z-Z in [D] Simple Questions Thread by AutoModerator
Jabrils. If you want to be swallowed up and eaten whole while you binge 20 hours on machine learning, it’s Jabrils.
blueSGL t1_jc5s56i wrote
Reply to comment by phire in [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
Less than $100 to get this sort of performance out of a 7B parameter model, and per the LLaMA paper they stopped training the 7B and 13B models early.
The question now is just how much better small models can get (a lawyer/doctor/therapist in everyone's pocket, completely private?).
[deleted] t1_jc5rrce wrote
Reply to [D] ChatGPT without text limits. by spiritus_dei
This is for sure not a problem solely solved by Google Brain researchers.
blueSGL t1_jc5rpta wrote
Reply to comment by v_krishna in [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
You could even have it regenerate the conversation prior to the vocal synth if the character fails to mention the keyword (e.g. map) in the conversation.
You know, like a percentage chance skill check. (I'm only half joking)
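The skill-check loop is simple enough to sketch; `generate_reply` here is a hypothetical stand-in for whatever model call produces the dialogue:

```python
# Sketch of the half-joking idea: reroll the generation until the keyword
# shows up (with a cap). `generate_reply` is a hypothetical LLM call.
import random

def generate_reply(prompt: str) -> str:
    # Stand-in for the actual model call feeding the vocal synth.
    return random.choice([
        "You should head north past the old mill.",
        "Take this map, it marks the shrine in the hills.",
    ])

def reply_with_keyword(prompt: str, keyword: str, max_rolls: int = 5) -> str:
    for _ in range(max_rolls):            # the "percentage chance skill check"
        reply = generate_reply(prompt)
        if keyword in reply.lower():
            return reply                  # keyword mentioned, ship it
    return reply                          # give up after max_rolls attempts
```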
extopico t1_jc5revh wrote
Reply to comment by gwern in [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
Flan-T5 is good, and flan-t5-xl runs well on a 3060 in 8-bit mode. It's not meant to be a chatbot, however, which is why it does not stir up as much excitement. T5 is best used for tasks and for training to handle specific domains. That makes it far more interesting to me than LLaMA, which cannot (yet) be trained by us randoms.
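For anyone wanting to try it, loading flan-t5-xl in 8-bit looks roughly like this (sketch; the prompt and generation settings are illustrative):

```python
# Sketch: flan-t5-xl in 8-bit via transformers + bitsandbytes.
from transformers import AutoTokenizer, AutoModelForSeq2SeqLM

tokenizer = AutoTokenizer.from_pretrained("google/flan-t5-xl")
model = AutoModelForSeq2SeqLM.from_pretrained(
    "google/flan-t5-xl",
    load_in_8bit=True,   # int8 weights: ~3B params fit comfortably on a 12 GB card
    device_map="auto",
)

inputs = tokenizer(
    "Summarize: The quick brown fox jumped over the lazy dog.",
    return_tensors="pt",
).to(model.device)
outputs = model.generate(**inputs, max_new_tokens=64)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))
```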
sugar_scoot t1_jc5qogu wrote
What's the purpose of your study?
generatorman_ai t1_jc5q5z0 wrote
Reply to comment by kittenkrazy in [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
That's great, it's been hard to find people who are actually fine-tuning LLaMA. Would you mind sharing your experience for the benefit of the open-source community?
- Did you train the full-precision weights?
- Did you use memory optimizations like xformers, 8-bit Adam (from bitsandbytes), gradient checkpointing, etc.?
- How much VRAM does it take for a batch size of 1?
- hh seems to be a preference dataset for RLHF rather than a text corpus - how did you use it as a fine-tuning dataset?
- Did you first do instruction fine-tuning (using something like FLAN or Self-Instruct) or just the hh directly?
Disastrous_Elk_6375 t1_jc5pny8 wrote
Reply to comment by phire in [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
> TBH, I'm pretty impressed for a 7B parameter model.
Same here. I've tried a bunch of prompts from a repo and the "follow the instruction" part seems pretty good and consistent. The overall quality of the output is of course subpar compared to ChatGPT, but considering that we're talking about 7B vs 175B, this is pretty good!
Disastrous_Elk_6375 t1_jc5pe77 wrote
Reply to comment by Disastrous_Elk_6375 in [R] Stanford-Alpaca 7B model (an instruction tuned version of LLaMA) performs as well as text-davinci-003 by dojoteef
Some more:
P: I want you to act as a prompt generator. Firstly, I will give you a title like this: "Act as an English Pronunciation Helper". Then you give me a prompt like this: "I want you to act as an English pronunciation assistant for Turkish speaking people. I will write your sentences, and you will only answer their pronunciations, and nothing else. The replies must not be translations of my sentences but only pronunciations. Pronunciations should use Turkish Latin letters for phonetics. Do not write explanations on replies. My first sentence is "how the weather is in Istanbul?"." (You should adapt the sample prompt according to the title I gave. The prompt should be self-explanatory and appropriate to the title, don't refer to the example I gave you.). My first title is "Act as a Code Review Helper" (Give me prompt only)
A: I want you to act as a Code Review Helper.
Prime: I want you to review the code written by another person. Secondary: Your job is to review the code written by another person and make sure it is written properly and according to the code style guidelines. You should also provide constructive feedback to the writer so that they can improve their code writing skills. Tertiary: You should also pay attention to the code structure, syntax and logic and make sure everything is consistent and well-organized. Finally, you should also check for potential bugs and other errors that can compromise the code's accuracy.
P: I want you to act as a text based adventure game. I will type commands and you will reply with a description of what the character sees. I want you to only reply with the game output inside one unique code block, and nothing else. do not write explanations. do not type commands unless I instruct you to do so. when i need to tell you something in english, i will do so by putting text inside curly brackets {like this}. my first command is wake up
A: When I wake up, I see a bright light streaming through the window and casting shadows on the wall. I can smell the fresh morning air and hear the birds chirping outside.
P: I want you to act as a self-help book. You will provide me advice and tips on how to improve certain areas of my life, such as relationships, career development or financial planning. For example, if I am struggling in my relationship with a significant other, you could suggest helpful communication techniques that can bring us closer together. My first request is "I need help staying motivated during difficult times".
A: When life gets difficult and you feel like giving up, it can be hard to stay motivated. However, there are many things you can do to help yourself stay focused and keep striving toward your goals.
First, it is important to recognize the source of your motivation. Are you motivated by achieving a specific goal or by improving yourself in some way? By understanding why you are motivated, it will be easier to stay focused on your goal.
Next, set smaller, achievable goals. Break your larger goals into smaller, manageable tasks. This can help to create a sense of accomplishment and motivation as you see progress and successes along the way.
Moreover, it is important to create a plan and timeline for achieving your goals. Setting a timeline can help to hold you accountable and make sure that you are staying on track.
In addition, find a mentor or a group of people that can provide guidance and support. Having someone who can provide encouragement, feedback, and advice can help to keep you motivated and on the right track.
Finally, remember that life is unpredictable. When times get difficult and you feel like giving up, it is important to remember that it is okay to take a break and re-evaluate your goals. Life is about the journey, not the destination, and taking a break can help to give you perspective and renewed
This is not bad at all!
DreamMidnight t1_jc5om1y wrote
Reply to [D] Simple Questions Thread by AutoModerator
What is the basis of this rule of thumb in regression:
"a minimum of ten observations per predictor variable is required"?
What is the origin of this idea?
FrogBearSalamander t1_jc5vvrb wrote
Reply to comment by currentscurrents in [D]: Generalisation ability of autoencoders by Blutorangensaft
> Would love to read some research papers if you have a link!