Zhenxiao Luo

Senior Software Engineer, Pinterest

Zhenxiao Luo is a Sr. Staff Software Engineer at Pinterest, responsible for the big data real-time processing engine, the monitoring platform, and data preprocessing for large models. Prior to joining Pinterest, Mr. Luo worked at Cloudera, Uber, Twitter, and Facebook, where he was responsible for the development and operation of big data engines and machine learning platforms. He is a Presto committer and a member of the Presto Technical Steering Committee. He holds a bachelor's degree from Fudan University and is on leave from the PhD program at the University of Wisconsin-Madison.

Topic

Building a Customized In-House VectorDB at Pinterest

Since 2023, Pinterest has been designing and implementing internal tools and systems built on large language models, including VectorTableSearch, Text-To-SQL in QueryBook, and automated table documentation. Pinterest uses general-purpose GPT models from industry providers such as OpenAI. To make the system more flexible and to incorporate Pinterest-specific knowledge, we rely heavily on Retrieval Augmented Generation (RAG) to supply the model with Pinterest-specific material, such as engineering documentation, query history, automated table documentation, and Slack chats. Along the way we faced many challenges: training data lacked standards, which made it difficult to reuse; there was no shared development, operation, and maintenance of a VectorDB, so each team often worked separately; and there was no way to continuously optimize the large models. To address these problems, we designed and developed VectorDB as a service within Pinterest. It includes a company-wide unified data import platform, support for building and updating data indexes, and a high-performance query service. This talk describes the VectorDB service at Pinterest in detail, along with the difficulties we encountered in supporting its users while building it.