Alpaca comes fully quantized (compressed), and the only space you need for the 7B model is 4.21GB. Currently only the 7B model is available via alpaca.cpp.

According to a llama.cpp discussion thread, here are the memory requirements:

| Model | RAM required |
|-------|--------------|
| 7B    | 4GB          |
| 13B   | 8GB          |
| 30B   | 16GB         |
| 65B   | 32GB         |

Unless your computer is very, very old, it should work.

You also need a lot of space for storing the models. You do NOT have to install all models; you can install them one by one. The model name must be one of: 7B, 13B, 30B, and 65B.

Let's take a look at how much space each model takes up:

*(the original post shows a table of per-model disk usage here)*

The numbers above assume that you do NOT touch the original model files and keep BOTH the original model files AND the quantized versions. You can reduce this if you delete the original models (which are much larger) after installation and keep only the quantized versions.

Currently supported engines are `llama` and `alpaca`. If your mac doesn't have node.js installed yet, make sure to install node.js >= 10.

The request accepts the following options:

- `model`: (required) the model type + model name to query.
- `url`: only needed if connecting to a remote dalai server. If specified (for example `ws://localhost:3000`), it looks for a socket.io endpoint at that URL and connects to it; if unspecified, it uses the node.js API to run dalai locally.
- `threads`: the number of threads to use (the default is 8 if unspecified).
- `n_predict`: the number of tokens to return (the default is 128 if unspecified).
- `skip_end`: by default, every session ends with `\n\n`, which can be used as a marker to know when the full response has returned. If you don't want this suffix, set `skip_end: true` and the response will no longer end with `\n\n`.
- `callback`: the streaming callback function that gets called every time the client gets any token response back from the model.
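To make the callback and end-marker behavior concrete, here is a minimal sketch of how a caller might accumulate streamed tokens and use the `\n\n` suffix to detect the end of a session. `collectResponse` is a hypothetical helper written for illustration, not part of dalai, and the commented-out `request` call assumes the option names described above:

```javascript
// dalai invokes the streaming callback once per token. Unless skip_end: true
// is set, the final response ends with "\n\n", which works as an end marker.
function collectResponse(tokens) {
  let response = "";
  for (const token of tokens) {
    response += token; // each callback invocation appends one token
  }
  // Detect the "\n\n" end marker that signals the session is complete
  const done = response.endsWith("\n\n");
  // Strip the marker, mimicking what skip_end: true would give you
  const text = done ? response.slice(0, -2) : response;
  return { text, done };
}

// Hypothetical usage with dalai itself (assumes the API described above):
// const Dalai = require("dalai");
// const tokens = [];
// new Dalai().request(
//   { model: "7B", prompt: "...", n_predict: 128, threads: 8 },
//   (token) => tokens.push(token)   // fires once per streamed token
// );
// // later: const { text, done } = collectResponse(tokens);
```

Buffering tokens like this is only needed if you want the full response; for interactive display you would typically write each token straight to the screen as the callback fires.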