visarga t1_iy2w2kd wrote
Reply to comment by CypherLH in 2002 vs 2012 vs 2022 | how has technology changed? by Phoenix5869
> By comparison the leap from 2012 to 2022 seems smaller...
True, but this is also the golden period of AI. I think 90% of all AI research has been done in the last 10 years.
visarga t1_iy2uc9o wrote
Young'uns, I still remember 8-bit processors in the 1980s and loading programs from cassette tape. My father was still using IBM-style punch cards at work when I was a child; I once messed up a whole stack playing with them. Each card was one line of code, and he had to sort the stack back by hand.
I think the biggest driver of change in the last 20 years was the leap in computing and communication speed. It took us from the PC era into the internet era. That meant an explosion in online media, which in turn made it possible to collect the huge datasets being used to train AI today.
The things I've seen. I remember Geoffrey Hinton presenting his pre-deep-learning work on Restricted Boltzmann Machines around 2005. It instantly got my attention and I started following the topic; back then ML was a pariah. Twelve years later I was working in AI. I've had a front-row seat for every step AI has taken since 2012, when things heated up. I read the Residual Networks paper the day it was published and witnessed the birth of the transformer. I've seen GANs come and go, and even talked with their original author, Ian Goodfellow, right here on reddit before he got famous. I got to train many neural nets and play with even more. Much of what I learned is already useless: GPT-3 and Stable Diffusion are so open ended that projects which used to take years now take weeks.
Funny thing: when Hinton published the RBM paper he was doing unsupervised learning, which I thought was very profound. But in 2012 the big breakthrough was supervised learning (ImageNet). For about five years only supervised learning got the attention and admiration, and then in the last five years unsupervised learning won the spotlight back. How the wheel turns.
visarga t1_ixmbq7v wrote
Reply to comment by purple_hamster66 in When they make AGI, how long will they be able to keep it a secret? by razorbeamz
AI is not that creative yet (maybe it will be in the future), but then, how many mathematicians are? It does appear able to solve hard problems that are not in its training set:
> Meta AI has built a neural theorem prover that has solved 10 International Math Olympiad (IMO) problems — 5x more than any previous AI system.
> trained on a dataset of successful mathematical proofs and then learns to generalize to new, very different kinds of problems
This is from 3 weeks ago: link
visarga t1_ixigmah wrote
Reply to comment by RomanScallop in what does this sub think of Elon Musk by [deleted]
They are a dot on the edge of the space of knowledge.
visarga t1_ixig5d7 wrote
Reply to comment by Frumpagumpus in what does this sub think of Elon Musk by [deleted]
> honestly the key technical insights in the paper probably come from one or two people
Only a few lottery tickets win, but beforehand we don't know which ones they are. Hindsight is 20/20.
visarga t1_ixiec41 wrote
Reply to comment by purple_hamster66 in When they make AGI, how long will they be able to keep it a secret? by razorbeamz
The main idea here is to use:

- a method to generate solution candidates - a language model
- a method to filter/rank the candidates - an ensemble of predictions, or running a test (such as unit tests for generated code)

Examples (a rough code sketch of the pattern follows the list):

- Minerva - https://ai.googleblog.com/2022/06/minerva-solving-quantitative-reasoning.html
- AlphaCode - https://www.deepmind.com/publications/competition-level-code-generation-using-deep-language-models (above-average competitive programmer)
- FLAN-PaLM - https://paperswithcode.com/paper/scaling-instruction-finetuned-language-models (top score on MMLU math problems)
- DiVeRSe - https://paperswithcode.com/paper/on-the-advance-of-making-language-models (top score on MetaMath)
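A minimal sketch of this generate-then-filter pattern, using unit tests as the filter. The `generate_candidates` function is a placeholder for whatever language model you use; all names here are illustrative, not any particular system's API:

```python
import subprocess
import tempfile
from collections import Counter

def generate_candidates(prompt: str, n: int) -> list[str]:
    """Placeholder for a language model call that samples n candidate solutions."""
    raise NotImplementedError  # plug in GPT-3, PaLM, etc.

def passes_tests(candidate: str, test_code: str) -> bool:
    """Filter step: run the candidate program against unit tests in a subprocess."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(candidate + "\n" + test_code)
        path = f.name
    try:
        result = subprocess.run(["python", path], capture_output=True, timeout=10)
    except subprocess.TimeoutExpired:
        return False
    return result.returncode == 0

def solve(prompt: str, test_code: str, n: int = 50) -> str | None:
    # 1) sample many candidates, 2) keep only those that pass the tests,
    # 3) majority-vote among the survivors (self-consistency style)
    survivors = [c for c in generate_candidates(prompt, n) if passes_tests(c, test_code)]
    return Counter(survivors).most_common(1)[0][0] if survivors else None
```

For math problems the filter is typically agreement between sampled answers instead of unit tests, but the structure is the same.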
visarga t1_ixh56gc wrote
Reply to Meta AI presents CICERO — the first AI to achieve human-level performance in Diplomacy, a strategy game which requires building trust, negotiation and cooperation. by Kaarssteun
From AlphaGo to Diplomacy in just 6 years! People used to say the Go board is simple: everything is visible and there is a short list of possible actions, while the real world has uncertainty, complexity and a much more diverse action space. Well, Diplomacy has all of that.
visarga t1_ixggjfm wrote
Reply to [R] Getting GPT-3 quality with a model 1000x smaller via distillation plus Snorkel by bradenjh
> Has anyone else tried something similar?
Trying it right now, but instead of using GPT-3 I'm splitting the data cross-validation style and training ensembles of models. Ensemble disagreement ≈ error rate.
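Roughly what I mean, as a sketch: train one model per fold and use how often members disagree with the majority vote as a proxy for the error rate. This assumes sklearn-style classifiers with integer labels; the details are illustrative:

```python
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import KFold

def ensemble_disagreement(model, X_train, y_train, X_unlabeled, n_splits=5):
    """Train one clone of `model` per cross-validation fold of the labeled data,
    then measure how often the members disagree with the majority vote on
    unlabeled data; this disagreement roughly tracks the error rate."""
    models = []
    for train_idx, _ in KFold(n_splits=n_splits, shuffle=True, random_state=0).split(X_train):
        m = clone(model)
        m.fit(X_train[train_idx], y_train[train_idx])
        models.append(m)
    preds = np.stack([m.predict(X_unlabeled) for m in models])  # (n_splits, n_unlabeled)
    majority = np.apply_along_axis(lambda p: np.bincount(p).argmax(), 0, preds)
    return (preds != majority).mean()
```

The useful part is that the disagreement is computed on unlabeled data, so it gives an error estimate where you have no labels.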
visarga t1_ixfdcaj wrote
Reply to comment by TFenrir in When they make AGI, how long will they be able to keep it a secret? by razorbeamz
I don't believe that; OpenAI and a slew of other companies can make a buck on cutting-edge language/image models.
My problem with Google is that it often fails to understand the semantics of my queries and replies with totally unrelated content, so I don't believe in their deployed AI. It's dumb as a rock. They might have shiny AI in the labs, but the product is painfully bad. And their research teams almost always block the release of their models and don't even put up demos. What's the point in admiring such a bunch? Where's the access to PaLM, Imagen, Flamingo, and the other toys they dangled in front of us?
Given this situation, I don't think they really align themselves with AI advancement; instead they align with short-term profit making, which is to be expected. Am I spinning conspiracies, or just saying what we all know: companies work for profit, not for art?
visarga t1_ixd6ygt wrote
Reply to comment by TFenrir in When they make AGI, how long will they be able to keep it a secret? by razorbeamz
> I have absolutely no idea what you mean by "1-2 years late", in what way are they late?
GPT-3 was published in May 2020, PaLM in April 2022. There were a few other models in between, but they were not on the same level.
Dall-E was published in Jan 2021, Google's Imagen is from May 2022.
> Google is already looking at integrating language models
Yes, they are. But do a search and you'll see how poor the results are in reality. They don't want us to actually find what we're looking for, not immediately. They stand to lose money.
Look at Google Assistant: language models can write convincing prose and handle long dialogues, yet Assistant defaults to web search for 90% of questions and can't hold much context. Why? Because Assistant cuts into their profits.
I think Google wants to monopolise research but quietly delay its deployment as much as possible. That way their researchers are happy and don't leave to build competing products, while we are kept happy waiting for upgrades.
visarga t1_ixd4ks5 wrote
Reply to [P] BetterTransformer: PyTorch-native free-lunch speedups for Transformer-based models by fxmarty
Does it include Flash Attention?
visarga t1_ixbka8a wrote
Reply to comment by TFenrir in When they make AGI, how long will they be able to keep it a secret? by razorbeamz
They have a bunch of good models but they are 1-2 years late.
Also, Google stands to lose from the next wave of AI from a business perspective. The writing on the wall is that traditional search is on its way out now that more advanced AI can answer questions directly. That means ads won't get displayed. My theory is that this is why they are dragging their feet. The days of good old web search are numbered.
But hey, you could say they might ask the language model to shill for various products. True, but language models can also run on the edge, so we could have our own models that answer to our priorities and wishes.
That was never possible with web search, but it becomes possible with AI. The moral of the story is that Google's centralised system is being eroded; they are losing control and ad impressions.
visarga t1_ixa9md9 wrote
Reply to comment by purple_hamster66 in When they make AGI, how long will they be able to keep it a secret? by razorbeamz
> They even used a Genetic Evolution algorithm to find new proofs, and got a few that no human had thought of before
This shows you haven't been following the state of the art in theorem proving.
> AGI is an illusion, although a very good one that’s useful.
Hahaha. Yes, enjoy your moment until it comes.
visarga t1_ixa8oko wrote
Reply to comment by Drunken_F00l in When they make AGI, how long will they be able to keep it a secret? by razorbeamz
Look, you can prompt GPT-3 to give you this kind of advice if that's your thing. It's pretty competent at generating heaps of text like what you wrote.
You can ask it to take any position on any topic, the perspective of anyone you want, and it will happily oblige. It's not one personality but a distribution of personalities, and its message is not "The Message of the AI" but just a random sample from a distribution.
visarga t1_ixa7bi6 wrote
Reply to comment by entanglemententropy in When they make AGI, how long will they be able to keep it a secret? by razorbeamz
By the time we have AGI, the world will be full of proto-AGIs and advanced tool AIs. The AGI won't simply win the market; it will have to compete with specialised tools.
visarga t1_ixa6sej wrote
Reply to comment by TFenrir in When they make AGI, how long will they be able to keep it a secret? by razorbeamz
Probably OpenAI. Google has been playing catch-up as of late. I bet many of their best people from 2018 left long ago; for example, almost all of the original inventors of the transformer have left. This period of layoffs and resignations is seeding many startups.
visarga t1_ix2wyw6 wrote
Reply to comment by massimosclaw2 in [D] Are researchers attempting to solve the ‘omnipotence’ requirement problem in LLMs? by [deleted]
There is also prompt tuning, which fine-tunes only a few token embeddings while keeping the model itself frozen. This changes the problem from hunting for that elusive prompt to collecting a few labeled examples and fine-tuning the prompt on them.
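A minimal sketch of the prompt-tuning idea in PyTorch: a handful of learnable "soft prompt" vectors are prepended to the input embeddings, and only those vectors get gradients while the language model stays frozen. The class and dimensions here are illustrative, not any specific library's API:

```python
import torch
import torch.nn as nn

class SoftPrompt(nn.Module):
    """Learnable virtual-token embeddings prepended to the input sequence."""
    def __init__(self, n_tokens: int, d_model: int):
        super().__init__()
        self.prompt = nn.Parameter(torch.randn(n_tokens, d_model) * 0.02)

    def forward(self, input_embeds: torch.Tensor) -> torch.Tensor:
        # input_embeds: (batch, seq_len, d_model) -> (batch, n_tokens + seq_len, d_model)
        batch = input_embeds.size(0)
        prompt = self.prompt.unsqueeze(0).expand(batch, -1, -1)
        return torch.cat([prompt, input_embeds], dim=1)

# usage sketch: freeze the LM, optimize only the soft prompt
# for p in lm.parameters():
#     p.requires_grad_(False)
# soft_prompt = SoftPrompt(n_tokens=20, d_model=hidden_size)
# optimizer = torch.optim.Adam(soft_prompt.parameters(), lr=1e-3)
```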
Another approach is to use an LLM to generate prompts and filter them by evaluation. The same trick has been used to generate step-by-step reasoning traces for datasets that only have input-output pairs; then another model is trained on the examples plus chain of thought, for a big jump in accuracy.
There's a relevant paper here: Large Language Models Can Self-Improve. They find that
> fine-tuning on reasoning is critical for self-improvement
I would add that sometimes you can evaluate a result directly, for example when generating math or code. Then you can learn from the validated outputs of the network. This is basically what AlphaZero used to reach super-human level without supervision, but it requires a kind of simulator: a game engine, a Python interpreter, or a symbolic math engine.
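As a sketch, that verify-then-learn loop looks roughly like this; `generate`, `verify` and `fine_tune` are placeholders for your model's sampler, the simulator/checker, and a training step:

```python
def self_improve(problems, generate, verify, fine_tune, n_samples=20):
    """generate(problem) -> a candidate solution from the current model
    verify(problem, solution) -> bool, e.g. run the code or check the math
    fine_tune(examples) -> model updated on the verified (problem, solution) pairs"""
    verified = []
    for problem in problems:
        for _ in range(n_samples):
            solution = generate(problem)
            if verify(problem, solution):      # the "simulator" provides the supervision signal
                verified.append((problem, solution))
                break
    return fine_tune(verified)                 # learn only from outputs that checked out
```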
visarga t1_ix2vuys wrote
Reply to [D] Are researchers attempting to solve the ‘omnipotence’ requirement problem in LLMs? by [deleted]
You mean like this? You just prepend "The following is a conversation with [a very intelligent AI | a human expert]". In image generation the trick is to add artist names to the prompt "in the style of X and Y", also called "style phrases" or "vitamin phrases".
Dall-E 2 was tweaked in a similar way to be more diverse: when you asked for a photo of a CEO or some other profession, it would add various race and gender keywords to the prompt. People were generally upset about having their prompts modified, but prepending a modifier by default might be useful in some cases.
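For concreteness, the trick is just string manipulation around the user's prompt; a toy sketch (the wording and function names are my own, not any product's actual code):

```python
def frame_prompt(user_prompt: str, persona: str = "a very intelligent AI") -> str:
    """Prepend a quality/persona framing before sending the prompt to a language model."""
    return f"The following is a conversation with {persona}.\n\n{user_prompt}"

def add_style_phrases(image_prompt: str, artists: list[str]) -> str:
    """Append 'vitamin phrases' to steer an image model toward a particular style."""
    return image_prompt + ", in the style of " + " and ".join(artists)
```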
If you want to extract a specific style or ability more precisely from a model you can fine-tune it on a small dataset, probably <1000 examples. This is easy to do using the cloud APIs, but not as easy as prompting.
visarga t1_ix0ji8d wrote
Reply to comment by Kolinnor in Why Meta’s latest large language model survived only three days online by nick7566
> it was utter trash and excessively arrogant
Galactica is a great model for citation retrieval. It has innovations in citation learning and beats all other systems. Finding good citations is a time consuming task when writing papers.
It also has a so-called <work> token that triggers external resources such as a calculator or a Python interpreter. This is potentially very powerful: it combines neural and symbolic reasoning.
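Galactica's actual pipeline is more involved, but as a toy illustration of the idea, a post-processor could scan the generated text for <work> blocks and offload them to the interpreter. The `result` convention below is my own assumption, purely for illustration:

```python
import re

def run_work_blocks(generated_text: str) -> str:
    """Execute each <work>...</work> block with Python and splice the result back in."""
    def execute(match: re.Match) -> str:
        code = match.group(1)
        scope: dict = {}
        try:
            exec(code, scope)                  # offload exact computation to the interpreter
            return str(scope.get("result", ""))
        except Exception as err:
            return f"[work block failed: {err}]"
    return re.sub(r"<work>(.*?)</work>", execute, generated_text, flags=re.DOTALL)

# e.g. run_work_blocks("The answer is <work>result = 12345 * 678</work>.") -> "The answer is 8369910."
```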
Another interesting finding from the paper is that a smaller, very high quality dataset can replace a much larger, noisy one. So there's a trade-off between quality and quantity, and it's not yet clear which direction has the bigger payoff.
I'd say the paper was singled out for criticism because it comes from Yann LeCun's AI lab. Yann made some enemies on Twitter a few years ago, and they don't forget or forgive. There's a good video on this topic by Yannic Kilcher.
And by the way, the demo still lives on HuggingFace: https://huggingface.co/spaces/lewtun/galactica-demo
visarga t1_iwwdkai wrote
Reply to comment by Rodny_ in [P]Modern open-source OCR capabilities and which model to choose by Rodny_
Because it's a lucrative AI API for all the big players: selling OCR for documents.
visarga OP t1_iwutuf1 wrote
This is a good read; it synthesizes where we are and what's coming soon.
I'd like to add that training LLMs on massive video datasets like YouTube will improve their procedural knowledge, how to do things step by step, with applications in robotics and software automation. We have seen large models for text and images, but video adds the time dimension, plus audio and speech. Very multi-modal.
Action-driven models are going to replace more and more human work, much more than the tool-AIs we have today. They will cause big changes in the job market.
Submitted by visarga t3_yym067 in singularity
visarga t1_iwlxe03 wrote
Reply to comment by snairgit in [P] Thoughts on representing a Real world data by snairgit
About representing your features: I would not feed raw float values directly to a neural net. You either need to discretise the values or embed them, like the absolute positional embeddings in transformers. Or try a SIREN on the float values directly.
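For example, a sinusoidal embedding of a scalar feature, in the spirit of transformer positional embeddings; the dimensions and frequency range here are arbitrary, just a sketch:

```python
import math
import torch

def sinusoidal_embed(x: torch.Tensor, dim: int = 32, max_freq: float = 10000.0) -> torch.Tensor:
    """Map scalar features of shape (batch,) to (batch, dim) using sin/cos
    at geometrically spaced frequencies, like positional embeddings."""
    half = dim // 2
    freqs = torch.exp(-math.log(max_freq) * torch.arange(half, dtype=torch.float32) / half)
    angles = x.unsqueeze(-1) * freqs              # (batch, half)
    return torch.cat([torch.sin(angles), torch.cos(angles)], dim=-1)  # (batch, dim)
```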
visarga t1_iy3l7zy wrote
Reply to comment by User1539 in Google Has a Secret Project That Is Using AI to Write and Fix Code by nick7566
> What use will that be when you can describe your needs in a natural language to an AI and it will create the application for you?
The same thing happened with learning English: it used to be the smart choice, but now translation software has removed that barrier.