
Combat-happy Meta CEO Aims for Responsible and Safe AI with New Llama 2 Model 

Meta CEO Mark Zuckerberg wants to punch Elon Musk in the face, but he is also looking to beat down AI rivals with his company's latest large language model.

The Llama 2 model, announced by Meta on Tuesday, joins a rash of open-source AI models available for users to download.

The model is free, and its weights and tokenizer are available for download. But users need to fill out a download request, as Meta wants to keep the model out of the hands of evildoers.

EnterpriseAI filed a download request, which Meta evaluates before granting access; we received a download link within two hours of filling out the forms. Once Meta approves a request, the model can be downloaded via GitHub or Hugging Face.
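For those approved on the Hugging Face side, pulling the weights can be as simple as the sketch below. The repository name and token are assumptions based on Hugging Face's public Llama 2 listing, and access still depends on Meta granting the request.

```python
# A minimal sketch, assuming Meta has approved the request and the account's
# Hugging Face access token is valid; the repo ID is the gated listing for the
# 7B chat variant (an assumption to verify against the Hugging Face hub).
from huggingface_hub import snapshot_download

local_dir = snapshot_download(
    repo_id="meta-llama/Llama-2-7b-chat-hf",  # assumed gated repo name
    token="hf_...",                           # placeholder access token
)
print("Weights downloaded to", local_dir)
```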

The Llama 2 model is like ChatGPT – it was trained on information available on the Internet and can answer questions. Llama 2 is exclusively a chatbot, and once set up, it provides a prompt where users can ask questions and compose stories.


But Meta, the king of chat interfaces thanks to Facebook, WhatsApp, and the newly launched Threads application, isn't offering a direct interface for users to try out Llama 2.

"We're not offering Llama in a chat interface. Researchers or organizations would have to build their own interface on top of the model," a Meta spokesperson told EnterpriseAI.

Meta may have declined to offer a chat interface for Llama 2 because it does not want to deal with the kind of backlash over hallucinating and biased chatbots that hit OpenAI and Microsoft, which had to put safety mechanisms in place. Meta has a history of political backlash for its controversial role in trying to sway public dialogue.

Meta’s goal was to produce a transformer model that was safe from the outset. Custom implementations of Llama 2 on PCs will not be censored by Meta, but users can report problematic output to the company via its website.

Users can also run Llama 2 via cloud services such as Amazon Web Services and Microsoft Azure. On AWS, it is available through SageMaker JumpStart, which is widely used by Fortune 500 companies to test and develop AI models.
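On AWS, a deployment might look roughly like the sketch below, which uses the SageMaker Python SDK's JumpStart interface. The model ID, the accept_eula flag, and the payload format are assumptions to check against the current JumpStart catalog and SDK version.

```python
# A minimal sketch, assuming the SageMaker Python SDK's JumpStart interface.
# The model ID and request schema below are assumptions, not a verified recipe.
from sagemaker.jumpstart.model import JumpStartModel

model = JumpStartModel(model_id="meta-textgeneration-llama-2-7b-f")  # assumed ID

# Llama 2 is a gated model, so the deployment has to acknowledge Meta's license;
# accept_eula is how recent SDK versions appear to surface that (an assumption).
predictor = model.deploy(accept_eula=True)

response = predictor.predict({
    "inputs": "List three use cases for an open-source language model.",
    "parameters": {"max_new_tokens": 128, "temperature": 0.6},
})
print(response)
```

The resulting endpoint behaves like any other SageMaker real-time endpoint and should be deleted when no longer needed to avoid ongoing charges.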

Meta claims the new transformer model is smarter than its predecessor, Llama 1, reasoning better and providing more relevant answers. Llama 2 was trained on 40% more data than Llama 1, which the company says reduces instances of hallucinated or incorrect answers.

"We have taken measures to increase the safety of these models, using safety-specific data annotation and tuning, as well as conducting red-teaming and employing iterative evaluations," Meta researchers said in a paper outlining Llama 2.


Training Llama 2 took six months, covering pretraining as well as tuning based on human feedback. The process surprisingly relied heavily on supervised training at various stages of fine-tuning, a thumbs down for the unsupervised techniques used to train GPT-4.

"The tuned versions use supervised fine-tuning and reinforcement learning with human feedback to align to human preferences for helpfulness and safety," Meta researchers wrote in the paper.

The model training also included rejection techniques reminiscent of decision-tree steps, in which less important data is discarded as part of the fine-tuning process.
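To make the idea concrete, the toy sketch below shows how such a rejection step might look in principle: several candidate responses are scored and only the best one survives into the fine-tuning data. The reward_model function is a hypothetical stand-in, not Meta's actual implementation.

```python
# A conceptual sketch only; reward_model is a hypothetical scoring function,
# not Meta's actual fine-tuning pipeline.
def keep_best_response(prompt, candidates, reward_model):
    """Score each candidate answer and keep only the highest-scoring one."""
    scored = [(reward_model(prompt, c), c) for c in candidates]
    best_score, best_response = max(scored, key=lambda pair: pair[0])
    return best_response  # lower-scoring candidates are rejected

# The surviving (prompt, best_response) pairs would then be folded back into
# the supervised fine-tuning data.
```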

The model comes in sizes ranging from 7 billion to 70 billion parameters. Meta claimed Llama 2 has better reasoning capabilities than open-source transformer models with a comparable number of parameters, including Falcon and MosaicML's MPT.

But Llama 2 is not necessarily better than closed-source AI transformer models like GPT-4.

"They also appear to be on par with some of the closed-source models, at least on the human evaluations we performed," the researchers wrote in the paper.

The model was trained at Meta’s Research Super Cluster on Nvidia A100 GPUs with 80GB of memory. The training consumed a cumulative 3.3 million GPU hours.

AIwire