**Note:** I have extracted the frames from every video and saved them in a folder with the same name as the video.

**train_data, class, video** ---> These are folders

**img** --> These are jpg files. Each class has many videos; I extracted the frames of each video and saved them in a folder named after the video they came from.

**The directory layout of my dataset looks like this:**

---train_data
       ----class1
            -----video11
                   ----img111.jpg
                   ----img112.jpg
                       ...
                   ----img11n.jpg
            -----video12
                   ----img121.jpg
                   ----img122.jpg
                       ...
                   ----img12n.jpg
                 ...
            -----video1m
                   ----img1m1.jpg
                   ----img1m2.jpg
                       ...
                   ----img1mn.jpg
       ----class2
            -----video21
                   ----img211.jpg
                   ----img212.jpg
                       ...
                   ----img21n.jpg
            -----video22
                   ----img221.jpg
                   ----img222.jpg
                       ...
                   ----img22n.jpg
                 ...
            -----video2m
                   ----img2m1.jpg
                   ----img2m2.jpg
                       ...
                   ----img2mn.jpg
       ....
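
A layout like this can be turned into a flat index with the standard library alone. The following is only a minimal sketch: the function name `index_videos` is hypothetical, class indices are assumed to follow the alphabetical order of the class folders, and frames are assumed to sort correctly by file name (which only holds if the numbering is zero-padded or otherwise sorts lexicographically).

```python
from pathlib import Path


def index_videos(root):
    """Return (class_names, videos); videos is a list of (frame_paths, class_index)."""
    root = Path(root)
    # Assumption: every sub-folder of root is one class.
    class_names = sorted(p.name for p in root.iterdir() if p.is_dir())
    videos = []
    for label, name in enumerate(class_names):
        # Assumption: every sub-folder of a class folder is one video.
        for video_dir in sorted(p for p in (root / name).iterdir() if p.is_dir()):
            frames = sorted(video_dir.glob("*.jpg"))
            if frames:  # skip folders without extracted frames
                videos.append((frames, label))
    return class_names, videos
```

Calling `index_videos("train_data")` would give one entry per video folder, which is a convenient starting point for cutting each video into clips.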

Frames extracted per video = 28

Total classes = 101

Total videos = 10619

Total images = 301169

Temporal length = 16

Temporal stride = 4

**For each video** ==> A window of 16 frames slides over the extracted frames in steps of 4: it first reads frames 1 to 16, then frames 5 to 20, then frames 9 to 24, and finally frames 13 to 28.

The total number of samples is therefore 4 per video, ending at frames [16, 20, 24, 28] (28 extracted frames).
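
The windowing described here can be sketched in a few lines. The helper name `clip_windows` is made up for illustration; it uses 0-based start indices with an exclusive end, and it drops any trailing frames that do not fill a complete window.

```python
def clip_windows(num_frames, clip_len=16, stride=4):
    """Return (start, end) index pairs, end exclusive, for every full clip."""
    return [(s, s + clip_len) for s in range(0, num_frames - clip_len + 1, stride)]


# Print the windows for a 28-frame video using 1-based frame numbers.
for start, end in clip_windows(28):
    print(f"frames {start + 1}-{end}")
```

For 28 frames this lists the four windows 1-16, 5-20, 9-24 and 13-28; a video with fewer than 16 frames yields no window at all.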

Each sample contains 16 frames, each of shape 112x112x3.

Total number of samples for all classes = num_sample_for_each_video * total videos = 4 * 10619 = 42476 at most; the actual count is about 42142, because some videos yield only 3 samples.
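
The per-video count can be written as a formula, which also shows why some videos give 3 samples instead of 4. The symbols $F$ (frames in a video), $T$ (temporal length) and $S$ (temporal stride) are introduced here only for illustration.

```latex
N_{\text{clips}}(F) = \left\lfloor \frac{F - T}{S} \right\rfloor + 1, \qquad F \ge T

% With T = 16 and S = 4:
N_{\text{clips}}(28) = \left\lfloor \tfrac{12}{4} \right\rfloor + 1 = 4,
\qquad
N_{\text{clips}}(F) = 3 \quad \text{for } 24 \le F \le 27
```

If every one of the 10619 videos had exactly 28 frames, the total would be $4 \times 10619 = 42476$; a lower figure such as 42142 is consistent with a few hundred videos falling in the 24 to 27 frame range.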

This gives training data of shape ==> [42142, 16, 112, 112, 3]
          and label data of shape ==> [42142, ]


Can anyone tell me how I can load this with a DataLoader in PyTorch?



What I have tried:

I tried my own code; everything is explained above.
Posted
Updated 6-Aug-21 6:37am

1 solution

Check the documentation: PyTorch.
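
As a starting point, here is a minimal sketch of a map-style `Dataset` for the folder layout in the question. It is not a definitive implementation: the names `FrameClipDataset` and `make_loader` are hypothetical, frames are read with Pillow and NumPy (an assumption, any image reader would do), frame files are assumed to sort correctly by name, and pixel values are scaled to [0, 1] without further normalization.

```python
from pathlib import Path

import numpy as np
import torch
from PIL import Image
from torch.utils.data import DataLoader, Dataset


class FrameClipDataset(Dataset):
    """Hypothetical dataset: one item is one clip of clip_len frames plus its class index."""

    def __init__(self, root, clip_len=16, stride=4, size=(112, 112)):
        self.size = size
        root = Path(root)
        self.classes = sorted(p.name for p in root.iterdir() if p.is_dir())
        self.samples = []  # one (frame_paths, label) entry per clip
        for label, name in enumerate(self.classes):
            for video in sorted(p for p in (root / name).iterdir() if p.is_dir()):
                frames = sorted(video.glob("*.jpg"))
                # Slide a clip_len window over the frames in steps of stride.
                for start in range(0, len(frames) - clip_len + 1, stride):
                    self.samples.append((frames[start:start + clip_len], label))

    def __len__(self):
        return len(self.samples)

    def __getitem__(self, idx):
        paths, label = self.samples[idx]
        # Frames are loaded lazily, so only the file paths stay in memory.
        frames = [
            np.asarray(Image.open(p).convert("RGB").resize(self.size), dtype=np.float32)
            for p in paths
        ]
        clip = torch.from_numpy(np.stack(frames)) / 255.0  # (clip_len, H, W, 3)
        return clip, label


def make_loader(root, batch_size=8):
    """Wrap the dataset in a shuffling DataLoader."""
    return DataLoader(FrameClipDataset(root), batch_size=batch_size, shuffle=True)
```

Iterating over `make_loader("train_data")` would yield batches of shape `[batch, 16, 112, 112, 3]` with labels of shape `[batch]`. Layers such as `torch.nn.Conv3d` expect channels first, so `clips.permute(0, 4, 1, 2, 3)` is needed before feeding them to such a model; passing `num_workers` to `DataLoader` lets several processes read frames in parallel.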
   

This content, along with any associated source code and files, is licensed under The Code Project Open License (CPOL)



