
Author ORCID Identifier

https://orcid.org/0009-0005-6851-3093

Access Type

Open Access Dissertation

Document Type

Dissertation

Degree Name

Doctor of Philosophy (PhD)

Degree Program

Computer Science

Year Degree Awarded

2023

Month Degree Awarded

September

First Advisor

Mohit Iyyer

Second Advisor

Subhransu Maji

Third Advisor

Hamed Zamani

Fourth Advisor

Thang Luong

Fifth Advisor

Colin Raffel

Subject Categories

Artificial Intelligence and Robotics

Abstract

Substantial progress has been made in natural language processing (NLP) due to the advent of large language models (LLMs): deep neural networks with millions or billions of parameters pre-trained on large amounts of unlabeled data. However, these models share common weaknesses, including degraded performance in data-scarce scenarios and substantial computational resource requirements. This thesis develops methods that address these limitations and improve the applicability and performance of LLMs in resource-constrained settings with limited data and/or computational resources. To reduce the need for labeled data in data-scarce scenarios, I present two methods in Chapters 2 and 3, respectively. The first leverages beneficial relationships between NLP tasks for transfer learning, while the second combines data augmentation and self-training to boost few-shot learning performance, i.e., the ability to perform novel tasks from only a few labeled examples. In Chapter 4, I introduce a novel parameter-efficient transfer learning approach that reuses a single frozen model for all tasks while learning only minimal task-specific parameters (soft/continuous prompts) to represent tasks and transfer knowledge; this approach can match or outperform fine-tuning task-specific models (training the whole model on each task). In Chapter 5, I demonstrate the benefits of parameter-efficient transfer learning in a cross-lingual transfer setting. Finally, Chapter 6 concludes the thesis by outlining potential avenues for future research that aim to advance NLP through large-scale multi-task learning on multilingual and multimodal data.
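
To make the parameter-efficient transfer learning idea concrete, the sketch below illustrates generic soft prompt tuning with PyTorch and Hugging Face transformers: the backbone LLM is frozen and only a small matrix of continuous prompt embeddings is trained per task. This is a minimal illustration under stated assumptions, not the dissertation's exact method; the backbone name ("gpt2"), prompt length, and the forward_with_prompt helper are hypothetical choices made for the example.

import torch
from torch import nn
from transformers import AutoModelForCausalLM, AutoTokenizer

# Hypothetical choices for illustration only.
MODEL_NAME = "gpt2"
PROMPT_LEN = 20  # number of soft prompt tokens

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForCausalLM.from_pretrained(MODEL_NAME)

# Freeze the backbone: no gradients flow into the pre-trained weights.
for p in model.parameters():
    p.requires_grad = False

embed_dim = model.get_input_embeddings().embedding_dim
# The only trainable parameters: one (PROMPT_LEN x embed_dim) matrix per task.
soft_prompt = nn.Parameter(torch.randn(PROMPT_LEN, embed_dim) * 0.02)

def forward_with_prompt(input_ids, labels=None):
    """Prepend the learned soft prompt to the token embeddings and run the frozen LM."""
    tok_embeds = model.get_input_embeddings()(input_ids)          # (B, T, D)
    bsz = tok_embeds.size(0)
    prompt = soft_prompt.unsqueeze(0).expand(bsz, -1, -1)         # (B, P, D)
    inputs_embeds = torch.cat([prompt, tok_embeds], dim=1)        # (B, P+T, D)
    if labels is not None:
        # Mask the prompt positions so they do not contribute to the loss.
        pad = torch.full((bsz, PROMPT_LEN), -100,
                         dtype=labels.dtype, device=labels.device)
        labels = torch.cat([pad, labels], dim=1)
    return model(inputs_embeds=inputs_embeds, labels=labels)

# Only the soft prompt is optimized, so per-task storage is tiny.
optimizer = torch.optim.AdamW([soft_prompt], lr=1e-3)

# Example training step on a toy batch.
batch = tokenizer(["Translate to French: hello"], return_tensors="pt")
out = forward_with_prompt(batch["input_ids"], labels=batch["input_ids"])
out.loss.backward()
optimizer.step()

Because the backbone never changes, a single copy of the LLM can serve many tasks, with each task adding only PROMPT_LEN x embed_dim parameters rather than a full fine-tuned copy of the model.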

DOI

https://doi.org/10.7275/36003057

Creative Commons License

Creative Commons Attribution 4.0 License
This work is licensed under a Creative Commons Attribution 4.0 License.
