준나이's Blog

Locating and Editing Factual Associations in GPT (2023) Paper Review - 1

Abstract. Analyzes where an autoregressive transformer language model (LM) stores factual associations and how it recalls them. Causal intervention (= causal tracing): an experiment for analyzing which modules (= neural activations, layers, NN) inside the LM play a decisive role when the model predicts a factual association. The results show that the computations performed by the feed-forward layers (MLP) in the model's middle layers while processing the subject tokens influence the prediction of the target (= object). These compu..

Data Science/Paper Review 2023.05.14

CONTROL PREFIXES for Parameter-Efficient Text Generation (2021) Paper Review

Abstract. Prefix-tuning: a lightweight yet powerful technique for adapting a large pre-trained language model (PLM) to a downstream task, but it applies the same dataset-level prompt to every example. Control Prefixes: extends prefix-tuning into a dynamic prompt by additionally including input-dependent information, which makes the benefits of both prompt tuning and controlled generation available. Attribute-level representations are integrated into the PLM's layers and can guide the text to be generated in the desired direction. GEM ..

Data Science/Paper Review 2023.05.13

[P-tuning] GPT Understands, Too (2021) Paper Review

Abstract. Problem: GPTs trained with traditional fine-tuning do not perform well on NLU tasks. P-tuning: introduces trainable continuous prompt embeddings and reaches performance comparable to or better than BERT-family models on NLU tasks. Benchmarks: LAMA (knowledge probing) benchmark, SuperGLUE. 1. Introduction. Limitations of pre-trained language models (PLMs), especially GPT: pre-training an LM has proven effective across many tasks. Through pre-training, an LM's contextualised text repre..

Data Science/Paper Review 2023.05.13

Prefix-Tuning: Optimizing Continuous Prompts for Generation (2021) Paper Review

Abstract. Fine-tuning: the widely used way to perform downstream tasks with large pre-trained language models (LMs), but because all of the LM's parameters must be updated, every task needs a full copy of the LM. Prefix-tuning: a lighter-weight alternative to fine-tuning that keeps the LM's parameters frozen and optimizes a small continuous task-specific vector called the prefix. It is inspired by prompting; the prefix behaves like virtual tokens and influences the tokens that come after it. Experiment t..
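The idea in the excerpt above (frozen LM weights, a small trainable prefix that acts like virtual tokens) can be sketched with a single attention layer. This is a minimal illustration and not the paper's implementation: the layer sizes, the random "pre-trained" weights, and the helper names softmax and attention are all assumptions, and causal masking and the training loop are left out.

```python
import numpy as np

rng = np.random.default_rng(0)
d_model, n_tokens, n_prefix = 8, 5, 3  # hypothetical sizes

# Frozen "pre-trained" projections (random stand-ins, never updated).
W_q = rng.normal(size=(d_model, d_model))
W_k = rng.normal(size=(d_model, d_model))
W_v = rng.normal(size=(d_model, d_model))


def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)  # subtract max for stability
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attention(x, prefix_k=None, prefix_v=None):
    """Single-head attention; optional prefix keys/values act as virtual tokens."""
    q, k, v = x @ W_q, x @ W_k, x @ W_v
    if prefix_k is not None:
        # Prepend the prefix so every real token can attend to it.
        k = np.concatenate([prefix_k, k], axis=0)
        v = np.concatenate([prefix_v, v], axis=0)
    weights = softmax(q @ k.T / np.sqrt(d_model))
    return weights @ v


x = rng.normal(size=(n_tokens, d_model))  # embeddings of the real tokens

# The only parameters that would be trained in this sketch.
prefix_k = rng.normal(size=(n_prefix, d_model))
prefix_v = rng.normal(size=(n_prefix, d_model))

base = attention(x)
steered = attention(x, prefix_k, prefix_v)

trainable = prefix_k.size + prefix_v.size
frozen = W_q.size + W_k.size + W_v.size
print(base.shape, steered.shape)
print("trainable:", trainable, "frozen:", frozen)
```

The output keeps the same shape with or without the prefix, but its values change, which is how a prefix can steer the tokens after it while the LM's own weights stay untouched.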

Data Science/Paper Review 2023.05.07

AdapterHub: A Framework for Adapting Transformers (2020)

Abstract. Limitations of conventional fine-tuning: the current workflow in NLP involves downloading and fine-tuning pre-trained models with millions to billions of parameters. Storing and sharing models this large is generally slow, expensive, and time-consuming, which hinders project progress. Limitations of existing adapters: an adapter is a trained NN made up of a small number of parameters; it removes the burden of fine-tuning the whole model, but sharing and combining adapters is not very intuitive. AdapterHub: adapters trained for various tasks and languages are dynamically ..

Data Science/Paper Review 2023.05.03

K-ADAPTER: Infusing Knowledge into Pre-Trained Models with Adapters (2020)

Abstract. Problem: previously, new knowledge was injected into a pre-trained model by fine-tuning the model itself, but when many different kinds of knowledge are injected at once, information the model already held is lost. K-Adapter: develops a versatile knowledge-infused model while keeping the pre-trained model's parameters frozen. There are as many NN-based adapters, acting like plug-ins, as there are kinds of knowledge to inject; because no information is shared between adapters, continual learning is possible. Case study: injected..

Data Science/Paper Review 2023.05.02

BERT and PALs: Projected attention layers for efficient adaptation in multi-task learning (2019) Paper Review

Abstract. In multi-task learning, sharing information across tasks is common practice, and reducing the number of parameters this requires is important. Previously, a model had to be fine-tuned separately for each task, so $I$ tasks required $I$ separate models. This paper introduces a way to perform various tasks with a single model using a small number of parameters. 1. Introduction. Prior work on adaptation: sharing all parameters of the pre-trained model, which requires the input and output shapes to be identical. The approach proposed in the paper: share most parameters (generali..

Data Science/Paper Review 2023.05.01

Using LaTeX in matplotlib

import numpy as np
import numpy.random as rnd
import scipy.stats
import scipy.special
import matplotlib
import matplotlib.pyplot as plt
# Declare as follows (recent matplotlib versions expect a single string here, not a list)
matplotlib.rcParams['text.latex.preamble'] = r"\usepackage{amsmath}"
matplotlib.rc('text', usetex=True)
# Use as follows
eps_std = 0.63
plotv = np.linspace(-2, 3, 500)
p_v_c_s = np.vstack([scipy.stats.norm.pdf(plotv, loc=l, scale=eps_std) for l in [0, 1]]..

IT/Python 2022.11.19

np.unravel_index(): explanation and usage examples

Function signature: numpy.unravel_index(indices, shape, order='C')
Given shape = (M, N), the elements of the M x N matrix are assumed to be numbered 0 through M * N - 1. The function returns the coordinates at which indices (index: int, or indices: list of int) is located.
Example 1: np.unravel_index(6, (3, 4))
shape: (3, 4) -> np.arange(12).reshape(3, 4) = a matrix with M = 3, N = 4 filled with 0 to 11
                0    1    2 (return j)    3
0               0    1    2               3
1 (return i)    4    5    6 (index)       7
2               8    9    10              11
Result: (1, 2)
Example 2: np.ar..
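A short runnable check of the first example, as a sketch. The extra cases (several indices at once, the inverse np.ravel_multi_index, and the argmax use) and the scores array are additions for illustration and do not come from the post.

```python
import numpy as np

shape = (3, 4)
grid = np.arange(12).reshape(shape)  # matrix filled with 0..11

# A single flat index -> (row, column)
i, j = np.unravel_index(6, shape)
print((int(i), int(j)))  # -> (1, 2)
assert grid[i, j] == 6

# Several flat indices at once -> one array per axis
rows, cols = np.unravel_index([0, 6, 11], shape)
print(rows.tolist(), cols.tolist())

# Inverse operation: coordinates -> flat index
flat = np.ravel_multi_index((i, j), shape)

# Common use: locate the maximum of a 2-D array (made-up values)
scores = np.array([[0.1, 0.7, 0.2],
                   [0.9, 0.3, 0.4]])
best = np.unravel_index(np.argmax(scores), scores.shape)
```

np.argmax returns a position in the flattened array, so pairing it with np.unravel_index is the usual way to get back row and column coordinates.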

IT/Python 2022.11.15

Elasticsearch DSL search query basics

Retrieving index information
# retrieve information on every index (= table) installed in es
GET _cat/indices
Retrieving mapping information
# retrieve the index's mapping (= schema)
GET dev_recommend_storage_cst_mart/_mapping
search query - match_all
# return all documents in the index
GET dev_recommend_storage_prd_mart/_search { "query": { "match_all": {} } }
# same as GET dev_recommend_storage_prd_mart/_search
search query - term (query vs. filter)
# ordinary query that returns a score
# gen..
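The query vs. filter distinction named above can be sketched by building the request bodies as plain Python dictionaries. This is an illustration only: the field name "category", the value "book", and the helper names are hypothetical, and no Elasticsearch client is used.

```python
import json

INDEX = "dev_recommend_storage_prd_mart"  # index name taken from the post
FIELD = "category"  # hypothetical field name
VALUE = "book"      # hypothetical value


def match_all_body():
    """Body that returns every document in the index."""
    return {"query": {"match_all": {}}}


def term_query_body(field, value):
    """Query context: matching documents receive a relevance _score."""
    return {"query": {"term": {field: value}}}


def term_filter_body(field, value):
    """Filter context: a yes/no match that does not contribute to the score."""
    return {"query": {"bool": {"filter": [{"term": {field: value}}]}}}


def to_console(index, body):
    """Render the request in the console style used in the post."""
    return f"GET {index}/_search\n{json.dumps(body, indent=2)}"


print(to_console(INDEX, term_query_body(FIELD, VALUE)))
print(to_console(INDEX, term_filter_body(FIELD, VALUE)))
```

Filter context is the usual choice for exact-match conditions where ranking is not needed, since Elasticsearch can skip scoring and cache the result.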

IT/Elasticsearch 2021.04.19
