Naver and NC Eliminated in First-Round Evaluation of National AI Team Selection: Break News

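▲ The Ministry of Science and ICT held a kickoff ceremony for the "Independent AI Foundation Model" project and officially conferred the K-AI designation. (Photo: courtesy of the Ministry of Science and ICT) © Break News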

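Break News reporter Jeong Min-woo = LG AI Research, SK Telecom, and Upstage have passed the first-stage evaluation of the government's independent artificial intelligence (AI) foundation model project. Naver Cloud and NC AI were eliminated. The government had originally planned to select four teams, but because the assessment of model independence, and not only objective performance metrics, proved decisive, only three teams advanced to the second stage.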

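The Ministry of Science and ICT, together with the National IT Industry Promotion Agency (NIPA) and the Telecommunications Technology Association (TTA), announced the results of the first-stage evaluation of the independent AI foundation model project on the 15th.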

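The evaluation covered five elite teams: LG AI Research, SK Telecom, Naver Cloud, Upstage, and NC AI. After several rounds of in-depth discussion, the Ministry of Science and ICT, NIPA, and the elite teams conducted the first-stage evaluation by combining benchmark, expert, and user evaluations. The evaluation items included the "AI Frontier Index," which gauges AI model performance, and the "AI Diffusion Index," which reflects practical usability, cost efficiency, and ripple effects on the domestic and global AI ecosystem.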

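LG AI Research posted the best results in the benchmark evaluation. In the NIA benchmark evaluation, SK Telecom and LG AI Research each scored 9.2 out of 10, and in the global common benchmark evaluation LG AI Research earned the top score of 14.4 out of 20. In the global individual benchmark evaluation, Upstage and LG AI Research each scored a full 10 out of 10. LG AI Research also ranked first in the combined benchmark score, with 33.6 points.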

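The expert evaluation was carried out by a committee of 10 external AI experts from industry, academia, and research institutes through an extended, in-depth analysis. The committee assessed technical capability, development process, and independence based on the technical reports and model training log files each team submitted. LG AI Research recorded the top score of 31.6 out of 35.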

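In the user evaluation, in which 49 expert users including AI startup CEOs took part, LG AI Research again earned the highest score, a full 25 out of 25, for real-world applicability and inference cost efficiency.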
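The score arithmetic reported above can be checked with a short sketch. The figures are LG AI Research's scores as stated in this article; the variable and function names are illustrative. The assumption that the overall score is a plain sum of the benchmark, expert, and user components is not confirmed by the article, although the three maximums (40, 35, and 25) do add up to 100.

```python
from decimal import Decimal

# LG AI Research's scores as reported in the article: (points earned, maximum points).
benchmark_parts = {
    "nia": (Decimal("9.2"), Decimal("10")),
    "global_common": (Decimal("14.4"), Decimal("20")),
    "global_individual": (Decimal("10"), Decimal("10")),
}
expert = (Decimal("31.6"), Decimal("35"))
user = (Decimal("25"), Decimal("25"))


def total(parts):
    """Sum (earned, maximum) pairs exactly, avoiding float rounding."""
    parts = list(parts)
    earned = sum((e for e, _ in parts), Decimal("0"))
    maximum = sum((m for _, m in parts), Decimal("0"))
    return earned, maximum


# The article reports a combined benchmark score of 33.6 points.
benchmark_total = total(benchmark_parts.values())

# Assumption (not stated in the article): overall = benchmark + expert + user.
overall = total([benchmark_total, expert, user])

print(f"Benchmark: {benchmark_total[0]} / {benchmark_total[1]}")
print(f"Overall (assumed simple sum): {overall[0]} / {overall[1]}")
```

The three benchmark sub-scores sum to the 33.6 points the article reports. The overall figure holds only under the simple-sum assumption and should not be read as an officially announced total.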

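Combining the benchmark, expert, and user evaluations, LG AI Research, SK Telecom, and Upstage were selected to advance to the second stage. Naver Cloud placed in the upper ranks on overall score, but the finding that it did not meet the criteria for an independent AI foundation model proved decisive. Expert evaluators pointed out that Naver Cloud's model had limited independence because of its reliance on overseas models. NC AI received relatively low marks in the user evaluation and other areas.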

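Industry observers say that NC AI, despite promoting an industry-optimized model, recorded the lowest score for actual usability and fell behind in the overall score competition. In a project that aims to secure a "global top"-level model, the view is that it earned a passing grade in neither technical completeness nor practicality.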

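Naver Cloud was tripped up by the "From Scratch" controversy. "From scratch" refers to whether a team carried out the entire process, from model design through pre-training, on its own. According to the technical report Naver released, the weights of components such as the video and audio encoders were not initialized; those of external open-source models were used as is, which became a decisive ground for disqualification in the expert evaluation.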

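Meanwhile, the government had originally intended to narrow the field to four teams in the first evaluation round, but with two consortia eliminated, a change of plan became unavoidable. The government plans to select one more team through an additional selection round and carry the Dokpamo (independent foundation model) project forward.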

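It plans to open the door to both previously eliminated teams and new challengers in order to establish a lineup of four elite teams in total. The additionally selected team will receive various benefits, including GPU and data support and the "K-AI company" designation.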

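The Ministry of Science and ICT also plans to make its independence guidelines more specific during the second evaluation and the additional call for applicants. While keeping the broad evaluation method and criteria, it plans to gather opinions from academic and industry experts and develop more specific differentiated scoring criteria for from-scratch development.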

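"Taking the lessons of this first-stage evaluation, we will further minimize uncertainty right from the starting line as we enter the second stage," said Ryu Je-myung, Second Vice Minister of Science and ICT. He stressed again, "This project is a historic challenge for Korea to stand tall with its own technology in the global AI competition. We will continue the technological innovation competition to develop global top-level models, including by concentrating all available national capabilities and resources so that Korea can stand at the forefront, among the leading group, of the global AI technology race."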

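The project was launched to reduce dependence on global AI models and to secure an independent AI foundation model for which the entire process, from design through pre-training, is carried out in-house. Since the call for applications, the Ministry of Science and ICT has clearly defined independent AI as a domestic model free of licensing issues, not a derivative model produced by simply fine-tuning an overseas model.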

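On the technical side, the ministry set out as core requirements for independence the design of an original model architecture, in-house acquisition and processing of large-scale data, and full-process training using proprietary learning algorithms. On the policy side, it aimed to secure AI capabilities with sovereignty and control in order to reduce the risks that could arise from dependence on foreign AI in defense, diplomacy, security, and national infrastructure. On the ethical side, it also stressed license compliance and transparency when using open source.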
