Harbin Institute of Technology Shenzhen team enters the multimodal big model market, and its self-developed "Ruoyu-Jiutian" tops the OpenCompass list



Release time: 2023-08-09

Reprinted from 36Kr, author: Ben@36KR

Ruoyu-Jiutian achieves multimodal fusion of text, images, audio and video

36Kr learned that the team of the Computing and Intelligence Research Institute of Harbin Institute of Technology (Shenzhen) has established a multimodal large model research and development enterprise, Shenzhen Ruoyu Technology Co., Ltd. (hereinafter referred to as "Ruoyu Technology"), relying on the school's Harbin Asset Management Co., Ltd. to commercialize its research achievements. Ruoyu Technology's first multimodal large model, "Ruoyu-Jiutian", topped the OpenCompass multimodal large model list on its first participation.

Multimodal large model MMBench test list

01 "Ruoyu-Jiutian"

"12.3 billion parameters", "120 million image-text pairs", "5.5 million Chinese-English bilingual corpus samples", "1.2 million fine-tuning data samples", "500,000 enhanced data samples"... The improvement of core parameters has brought about a qualitative change in model capabilities. The Ruoyu-Jiutian multimodal large model has achieved remarkable performance in logical reasoning, relational reasoning, and perception capabilities. With more than 10 billion parameters, Ruoyu-Jiutian has achieved multimodal fusion of text, images, audio, and video. Its intelligent understanding and response capabilities not only cover fields such as natural language processing, computer vision, and speech recognition, but also more effectively break down the information barriers between modalities, integrating them into "Jiutian".

Multimodal large model MMBench dev list

"The Nine Heavens represents the highest heaven in ancient Chinese mythology, and symbolizes our infinite pursuit of technological progress and our yearning for an intelligent future. With its powerful understanding and response capabilities, this model transcends the boundaries of multiple modalities such as text, images, audio and video, and achieves true multimodal fusion," said Dr. Sun Teng, CEO of Ruoyu Technology.

02 Establishing a top team for large models

Harbin Institute of Technology Shenzhen Campus has established an asset company to encourage faculty and staff to commercialize and implement their research results. Harbin Institute of Technology (Shenzhen) also offers policy support for industry-university-research cooperation. When Ruoyu Technology was first established, the school participated as a start-up shareholder, providing strong support for the company's development.

Recently, IEEE Intelligent Systems, a well-known magazine in the field of artificial intelligence, announced its 2022 "AI's 10 to Watch" list. Professor Nie Liqiang was listed among them for his contributions in the field of multimodality. Professor Nie is a winner of the Damo Academy Young Orange Award and the TR35 China Award. He said that the achievements of HIT-Shenzhen in the field of artificial intelligence cannot remain only in the laboratory, but must be put to use in serving national defense, aerospace, and society.

Another AI expert at Ruoyu Technology is co-founder Professor Zhang Min. Professor Zhang is a specially appointed assistant to the president of Harbin Institute of Technology (Shenzhen), the first outstanding young scholar in the field of NLP in China, one of the national "Million Talents", a young and middle-aged expert with outstanding contributions to the country, and a recipient of a special allowance from the State Council. Harbin Institute of Technology ranks first among Chinese research institutions in the field of NLP in the authoritative computer science list CSRankings (2022-2023), and Professor Zhang is the person who has made the greatest contribution to this field at Harbin Institute of Technology.

Harbin Institute of Technology ranks first among institutions in mainland China in the field of NLP in CSRankings

Professor Zhang Min ranked first in the academic contribution list

Dr. Sun Teng, co-founder and CEO of Ruoyu Technology, is also a core expert of the company's R&D team. Dr. Sun's research has always focused on multimedia computing, and related results have been published in CCF Class A conferences and IEEE/ACM Transactions. Dr. Sun has previous successful entrepreneurial experience, covering both the full process of applying artificial intelligence technology in vertical fields and company management. Geng Chen, another co-founder of Ruoyu Technology, serves as the company's strategic advisor. He has been named the best technology analyst by New Fortune many times and has accumulated rich industry resources over his many years as a researcher. He is responsible for the company's investment and financing and for connecting and deploying industry resources.

03 Core Competencies of Ruoyu Technology

"Ruoyu Technology was established at this point in time with its historical mission and ideals. As cutting-edge R&D personnel, we can deeply feel the changes that artificial intelligence will bring to the future society. The productivity explosion brought about by generative artificial intelligence will redefine the production relations in all walks of life. It is our honor and mission to have the opportunity to participate in it."

Computing power, data and talent are the three major barriers to entry for big models. Ruoyu Technology has gathered these core elements since its inception. Its in-house R&D team, which cultivates leading talent, has developed the ability to iterate independently. In the future, "Ruoyu-Jiutian" will continue to iterate under the leadership of technical experts.

With its top entrepreneurial team, core capabilities of self-developed multimodal large models, and successful implementation experience, Ruoyu Technology says it will bring a touch of brilliance to the "Battle of 100 Models".

04 Build a universal AI large model foundation

It has become an industry consensus that each sector will be reshaped by large model capabilities. Judging from OpenAI's development path, when a model is large enough, new capabilities will emerge, including some that have never been seen before.

Ruoyu-Jiutian will continue to iterate in the future. Dr. Sun Teng said: "Ruoyu-Jiutian is still iterating in two opposite directions: bigger and smaller. On the one hand, it is increasing the magnitude of parameters and exploring the thresholds at which general multimodal large model capabilities emerge; on the other hand, to meet the application needs of industry users and achieve the greatest effect with the least computing power, what must be done is to compress large models into lightweight versions and finally combine them with edge computing devices."

Based on the multimodal big model base of "Ruoyu-Jiutian", Ruoyu's business model is fundamentally different from that of the AI 1.0 era. In the past, algorithms had to be re-developed for each demand, which made the business entirely project-based. "Ruoyu-Jiutian" is a unified multimodal big model foundation. The base does not need to be redesigned. It only needs to be fine-tuned on data from a given industry to obtain the corresponding industry model. Customers can even use their own data for secondary fine-tuning according to the needs of a segmented field.

The difficulty of multimodal large models lies in the fusion of multimodal information. Common fusion methods include relatively crude means such as linear superposition and cascading, but the final effect is often not as good as the performance of a single modality. This is because some technical teams lack the experience and ability to tune multimodal data and to fuse and align multimodal features. Ruoyu-Jiutian has a self-developed full-chain model training framework for multimodal feature extraction, alignment, fusion, and reasoning, as well as a comprehensive and detailed multimodal data collection and cleaning process. The model topped the multimodal large model list, which the team presents as proof of its leading strength in multimodal large models.
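To make the two "crude" fusion baselines named above concrete, here is a minimal Python sketch. It is an illustration only and not Ruoyu-Jiutian's method, which the article does not describe. It assumes "cascading" means joining feature vectors end to end (concatenation); the function names and toy vectors are hypothetical.

```python
from __future__ import annotations

from typing import Sequence


def linear_superposition(features: Sequence[Sequence[float]],
                         weights: Sequence[float]) -> list[float]:
    """Fuse equal-length feature vectors by a weighted element-wise sum."""
    if len(features) != len(weights):
        raise ValueError("need exactly one weight per modality")
    dim = len(features[0])
    if any(len(f) != dim for f in features):
        raise ValueError("linear superposition needs equal-length vectors")
    return [sum(w * f[i] for f, w in zip(features, weights))
            for i in range(dim)]


def concatenate(features: Sequence[Sequence[float]]) -> list[float]:
    """Fuse feature vectors by joining them end to end (assumed 'cascading')."""
    return [x for f in features for x in f]


# Hypothetical toy embeddings standing in for two modalities.
text_vec = [1.0, 2.0]
image_vec = [3.0, 4.0]

summed = linear_superposition([text_vec, image_vec], [0.5, 0.5])
joined = concatenate([text_vec, image_vec])
```

Neither baseline aligns the modalities before combining them: the sum simply averages whatever each encoder produced, and the concatenation leaves all cross-modal interaction to later layers, which is consistent with the article's remark that such fusion can underperform a single modality.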

Robots are system-level application products in the industrial field and are the key deployment direction of the "Ruoyu-Jiutian" multimodal large model base. Harbin Institute of Technology has a deep accumulation of industry-university-research cooperation in the field of robotics. In the future, embodied robots will need to integrate multimodal information such as voice, vision, decision-making, and control to form a closed loop. The "Ruoyu-Jiutian" multimodal large model base will be further integrated with Harbin Institute of Technology's accumulated research on robots, and the company is already cooperating in depth with many large listed companies in the consumer electronics and automotive fields.

With the "Ruoyu-Jiutian" multimodal large model base, Ruoyu Technology can fine-tune the existing base to provide personalized and customized services to users in different fields. It offers language pre-trained large models, multimodal pre-trained large models, vertical-field pre-trained large models and other capabilities, and is committed to building the future general AI platform and infrastructure.
