Our initial benchmarks from NVIDIA's GH200, running NousResearch's Hermes 3 fine-tuned Llama model: 3k RPS with 300 concurrent users. This superchip enables efficient memory bandwidth between CPU <> GPU via NVLink-C2C, and most inference engines are still a work in progress at maximizing it. So early!
About us
With Function's distributed generative AI, businesses save 80% on model hosting costs with infinite scalability and zero DevOps required. Function supports a vast range of models: use open-source or your private model.
- Website
- https://function.network
- Industry
- Technology, Information and Internet
- Company size
- 2-10 employees
- Type
- Privately held
- Founded
- 2024
Posts
-
Now giving the next-generation NVIDIA GH200 a stress-test run. What's new: direct CPU <> GPU access allowing for 7x more bandwidth and more efficient CPU offloading. 900GB/s throughput. 4-5x more throughput than H100s for inference. Crazy cost and performance gains!
-
OpenAI has been reducing costs for its latest models by 30 to 50%. All this means is they went from $$$$ to $$$. It's still very expensive at the end of the day, especially on output tokens.

So what's the alternative? AWS Bedrock offers a comparable open-source model, Llama 3.1 70B, for $0.99/1M input and output tokens, which is significantly cheaper. But even at this pricing, the bill can start to get out of hand as startups gain traction.

I think usage will become much more feasible on these types of models at $0.40 to $0.50 / 1M input AND output tokens. So far, initial benchmarks on the AI engine I'm building show that this is both possible and feasible as an inference provider.

Is it possible that we crash the price of tokens? Yes, but I believe access to AI should be an affordable commodity, and that starts with reducing the cost of tokens.
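To make the pricing gap concrete, here's a quick back-of-the-envelope sketch. The monthly token volumes are hypothetical, chosen only to show how the bill scales at different per-million-token prices; the two prices are the Bedrock figure and target range quoted above.

```python
# Illustrative cost arithmetic for per-million-token pricing.
# The 500M input / 100M output monthly volumes are made-up example numbers.

def monthly_cost(input_tokens_m: float, output_tokens_m: float,
                 price_in: float, price_out: float) -> float:
    """Cost in USD given token volumes (in millions) and $/1M-token prices."""
    return input_tokens_m * price_in + output_tokens_m * price_out

# A hypothetical startup processing 500M input / 100M output tokens a month:
bedrock = monthly_cost(500, 100, 0.99, 0.99)  # Llama 3.1 70B on AWS Bedrock
target = monthly_cost(500, 100, 0.45, 0.45)   # midpoint of the $0.40-$0.50 range

print(f"Bedrock: ${bedrock:,.2f}/mo")  # Bedrock: $594.00/mo
print(f"Target:  ${target:,.2f}/mo")   # Target:  $270.00/mo
```

At those volumes, the price drop cuts the bill by more than half; the gap only widens as usage grows.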
-
I firmly believe that AI belongs to all of us. It should never be monopolized by a select few due to closed-source models or the high costs of testing and hosting models and agents.

So, how do we transform AI from a tool for the few into a catalyst for the many? The answer is User-Owned AI. User-Owned AI prioritizes the user's success and well-being, rather than focusing solely on generating revenue for companies.

What does this mean?
- It means empowering individuals to host and run fine-tuned models while retaining full control over their creations.
- It means enhancing privacy and trust by eliminating the need for centralized access to model weights.
- It means reducing hosting costs and opening doors for innovators with limited budgets.
- It means simplifying deployment by removing technical barriers, enabling anyone to scale their AI solutions effortlessly.
- But most importantly… it means fostering a broader spectrum of people contributing to AI, leading to richer and more diverse innovations.

The potential of AI is too great to be constrained by cost, complexity, or control. Every mind should have the tools to turn imagination into innovation, unlocking limitless possibilities. So let's build on that now, not tomorrow.
-
A blank canvas with room for infinite imagination: with AI as an infinite f(x), our Image Captioning and Automatic Speech Recognition APIs will help users "see" and "hear" their words.

Image Captioning APIs seamlessly bridge computer vision with natural language processing, empowering your AI to better understand and describe images, leading to improved categorization, indexing, and more effective tagging and annotation.

The Automatic Speech Recognition API listens in real time, converting audio directly to text in seconds. Summarize everything from videos to real-world speeches with instant note transcription. Translate to any language, enabling natural and intuitive interactions with technology.

With endless possibilities and infinite room for growth, AI allows us to conveniently connect and distribute information across different spaces. Like, share, follow us, and sign up for our waitlist today at https://lnkd.in/gsCKRWJw
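As a sketch of what calling a captioning endpoint over HTTP might look like: the URL, field names, and model identifier below are assumptions for illustration only, not the published f(x) schema.

```python
# Minimal image-captioning request sketch using only the standard library.
# API_URL and the payload fields are hypothetical placeholders.
import base64
import json
from urllib import request

API_URL = "https://api.function.network/v1/caption"  # hypothetical endpoint

def build_caption_request(image_path: str) -> dict:
    """Read an image, base64-encode it, and wrap it in a JSON-able payload."""
    with open(image_path, "rb") as f:
        image_b64 = base64.b64encode(f.read()).decode("ascii")
    return {"model": "image-captioning", "image": image_b64}

def caption(image_path: str, api_key: str) -> str:
    """POST the payload and return the caption field of the JSON response."""
    payload = json.dumps(build_caption_request(image_path)).encode()
    req = request.Request(API_URL, data=payload, headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    })
    with request.urlopen(req) as resp:
        return json.load(resp)["caption"]
```

An ASR endpoint would follow the same shape, with audio bytes in place of the image.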
-
AI can be a limitless function as long as we can imagine it. A world of information improving with each prompt. But a world without relationships cannot be sustained. So let's dive into our Text Embeddings and Stable Diffusion APIs to see how they help create that world.

Text Embeddings are the foundation of RAG pipelines. By splitting text into smaller units and mapping them to numerical vectors, we establish mathematical relationships that give meaning. This enables AI to classify, translate, and retrieve relevant search results more accurately.

Text-to-image empowers the creation of images from prompts. Our f(x) APIs offer state-of-the-art models like Stable Diffusion to fuel your creative ideas. Whether you're creating concept art, marketing, social media, educational, or training materials, we've got you covered.

The world of information is ours to create, learning and growing with each prompt given. Like, share, follow us, and sign up for our waitlist today at https://lnkd.in/g2KFHXYc Join us on this thrilling journey to learn more about our services and network specifications!
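The embedding idea above can be sketched in a few lines. The hash-based "embedding" below is a deliberately crude stand-in for a real model, used only to show how cosine similarity over vectors ranks related text higher.

```python
# Toy illustration of text embeddings + cosine-similarity retrieval.
# embed() is a bag-of-words feature hash, NOT a real embedding model.
import math
import zlib
from collections import Counter

def embed(text: str, dim: int = 64) -> list[float]:
    """Map text to a fixed-size vector by hashing word counts into buckets."""
    vec = [0.0] * dim
    for word, count in Counter(text.lower().split()).items():
        vec[zlib.crc32(word.encode()) % dim] += count
    return vec

def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity: dot product divided by the product of norms."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

docs = [
    "stable diffusion turns prompts into images",
    "text embeddings power retrieval in RAG pipelines",
]
query = embed("retrieval with text embeddings")
best = max(docs, key=lambda d: cosine(embed(d), query))  # the embeddings doc wins
```

A real pipeline swaps `embed()` for a model-served embeddings endpoint and stores the vectors in an index, but the ranking step is exactly this similarity comparison.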
-
Here at f(x), we are creating multiple services that will help empower your life. The first two of five are Chat Completion and Code Completion. The days can be long, so let AI join you for a conversation and help with your spaghetti code!

Our Chat Completion APIs generate natural, context-aware conversations powered by popular open-source models. Integrate intelligent chat assistants into your application with human-like interactions that handle complex dialogues.

Finished chatting and ready to dive into coding? Load up our Code Completion APIs, fully compatible with open-source AI assistants such as Continue, to supercharge your coding abilities!

We've all heard the phrase "Work smarter, not harder," so check in and join us tomorrow as we continue to introduce our upcoming services. Follow and visit our site to sign up for our waitlist today at https://lnkd.in/g2KFHXYc
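As an illustration of what an OpenAI-compatible chat-completion call could look like: the base URL and model name below are placeholders for this sketch, not the published f(x) endpoint.

```python
# Minimal chat-completion request sketch (OpenAI-compatible message format).
# BASE_URL and the model name are hypothetical placeholders.
import json
from urllib import request

BASE_URL = "https://api.function.network/v1"  # hypothetical endpoint

def build_chat_payload(user_message: str, model: str = "llama-3.1-70b") -> dict:
    """Assemble the standard system + user message list for a chat request."""
    return {
        "model": model,
        "messages": [
            {"role": "system", "content": "You are a helpful coding assistant."},
            {"role": "user", "content": user_message},
        ],
    }

def chat(user_message: str, api_key: str) -> str:
    """POST the request and return the assistant's reply text."""
    payload = json.dumps(build_chat_payload(user_message)).encode()
    req = request.Request(f"{BASE_URL}/chat/completions", data=payload, headers={
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    })
    with request.urlopen(req) as resp:
        return json.load(resp)["choices"][0]["message"]["content"]
```

Tools like Continue speak this same message schema, which is why pointing them at a compatible endpoint is usually just a base-URL change.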
-