LDC at ICASSP 2023
New publications:
2019 NIST Speaker Recognition Evaluation Test Set – CTS Challenge
LORELEI Zulu Representative Language Pack
_____________________________________________________________
LDC at ICASSP 2023
LDC will be exhibiting at ICASSP 2023, held this year June 4-10 in Rhodes, Greece. Stop by booth 15 to learn more about recent developments at the Consortium and the latest publications.
LDC will post conference updates via Twitter and Facebook. We look forward to seeing you there!
New publications:
2019 NIST Speaker Recognition Evaluation Test Set – CTS Challenge, developed by LDC and NIST, contains 635 hours of Tunisian Arabic telephone recordings for development and test, along with answer keys, enrollment and trial files, and documentation from the CTS Challenge portion of the NIST-sponsored 2019 Speaker Recognition Evaluation. The 2019 evaluation was conducted in two parts: (1) a leaderboard-style challenge based on conversational telephone speech from LDC's Call My Net 2 (CMN2) corpus; and (2) a separate evaluation using audio-visual material collected by LDC for the VAST (Video Annotation for Speech Technology) project (released as LDC2023V01).
2023 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.
*
LORELEI Zulu Representative Language Pack comprises over 5 million words of Zulu monolingual text, 2.7 million words of found Zulu-English parallel text, and 71,000 Zulu words translated from English data. Approximately 100,000 words were annotated for named entities, and over 23,000 words were annotated for entity discovery and linking and for situation frames (identifying entities, needs and issues). Data was collected from discussion forums, news, reference material, social networks, and weblogs.
The LORELEI (Low Resource Languages for Emergent Incidents) program was concerned with building human language technology for low resource languages in the context of emergent situations. Representative languages were selected to provide broad typological coverage.
The knowledge base for entity linking annotation is available separately as LORELEI Entity Detection and Linking Knowledge Base (LDC2020T10).
2023 members can access this corpus through their LDC accounts. Non-members may license this data for a fee.