A problematic situation when reading raw video data from a Dataset (memory leakage)
https://github.com/pytorch/pytorch/issues/13246
DataLoader num_workers > 0 causes CPU memory from parent process to be replicated in all worker processes · Issue #13246 · pytorch/pytorch
"Editor note: There is a known workaround further down on this issue, which is to NOT use Python lists, but instead using something else, e.g., torch.tensor directly. See #13246 (comment)."
The root cause lies in Python data structures such as list and dict themselves (reading them from the workers touches each object's refcount, so the parent's copy-on-write pages get duplicated), so it is hard to call this a PyTorch bug.
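As the editor note quoted above suggests, one fix is simply not to hold the samples in a Python list and to store them in a single torch.Tensor instead. A minimal sketch, reusing the 24,000,000-element toy dataset from the issue (the class name TensorBackedDataIter is just illustrative):

from torch.utils.data import Dataset
import torch


class TensorBackedDataIter(Dataset):
    # All samples live in one contiguous tensor, i.e. a single refcounted
    # Python object, so workers do not touch millions of per-element refcounts.
    def __init__(self):
        self.data = torch.arange(24000000)

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        return self.data[idx]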
Workaround 1.
Multiprocessing Manager (the approach AVT used)
from torch.utils.data import Dataset, DataLoader
import torch
from multiprocessing import Manager


class DataIter(Dataset):
    def __init__(self):
        # The Manager keeps the list in a separate server process;
        # workers access it through a proxy instead of replicating it.
        manager = Manager()
        self.data = manager.list([x for x in range(24000000)])

    def __len__(self):
        return len(self.data)

    def __getitem__(self, idx):
        data = self.data[idx]
        return torch.tensor(data)


train_data = DataIter()
train_loader = DataLoader(train_data, batch_size=300,
                          shuffle=True,
                          drop_last=True,
                          pin_memory=False,
                          num_workers=18)

for i, item in enumerate(train_loader):
    if i % 1000 == 0:
        print(i)
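Note on the design choice: because the manager.list lives in a separate server process, every __getitem__ call goes through an IPC round-trip to that proxy. Memory stays flat, but data loading can become noticeably slower than with tensor- or array-backed storage, so it is worth benchmarking both.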
Workaround 2
Memory blows up because references to every object held in the list keep accumulating as items are read, so return deep copies from __getitem__ instead of the stored objects.
from copy import deepcopy

def __getitem__(self, idx):
    # Return deep copies so callers never hold references
    # to the objects stored inside the dataset's containers.
    seg_tag = deepcopy(self.segment_tags[idx])           # str
    traj_info = deepcopy(self.traj_infos[seg_tag])       # dict
    traj_feat = deepcopy(self.traj_features[seg_tag])    # (n_traj, 2048)
    traj_embd = deepcopy(self.traj_embeddings[seg_tag])  # (n_traj, 256)
    return seg_tag, traj_info, traj_feat, traj_embd
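To check whether a workaround actually stops the growth, the resident memory can be watched while iterating. A minimal sketch assuming the third-party psutil package is installed (the 1000-batch print interval is arbitrary):

import os
import psutil

proc = psutil.Process(os.getpid())
for i, item in enumerate(train_loader):
    if i % 1000 == 0:
        # RSS of the main process; use top/htop to watch the worker processes as well.
        print(i, f"RSS = {proc.memory_info().rss / 1024**2:.1f} MiB")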