MSLID-TCN: multi-stage linear-index dilated temporal convolutional network for temporal action segmentation
| dc.authorid | Toktas, Abdurrahim/0000-0002-7687-9061 | |
| dc.contributor.author | Gao, Suo | |
| dc.contributor.author | Wu, Rui | |
| dc.contributor.author | Liu, Songbo | |
| dc.contributor.author | Erkan, Uğur | |
| dc.contributor.author | Toktaş, Abdurrahim | |
| dc.contributor.author | Liu, Jiafeng | |
| dc.date.accessioned | 2025-01-12T17:19:40Z | |
| dc.date.available | 2025-01-12T17:19:40Z | |
| dc.date.issued | 2024 | |
| dc.department | Karamanoğlu Mehmetbey Üniversitesi | |
| dc.description.abstract | The Temporal Convolutional Network (TCN) has received extensive attention in the field of speech synthesis. Many researchers use TCN-based models for action segmentation, since both tasks focus on contextual connections. However, TCN can only capture the long-term dependencies of information and ignores the short-term dependencies, which can lead to over-segmentation by dividing a single action interval into multiple action categories. This paper proposes the Multi-Stage Linear-Index Dilated TCN (MSLID-TCN) model, in which each layer has an appropriate receptive field, allowing the video's short-term and long-term dependencies to be passed to the next layer, thereby mitigating the over-segmentation problem. MSLID-TCN has a four-stage structure. The first stage is a LID-TCN, while the remaining stages are Single-Stage TCNs (SS-TCNs). The I3D features of the video are used as the input to MSLID-TCN. In the first stage, LID-TCN makes initial predictions on frame features to obtain predicted probability values. In the last three stages, these probability features are used as input to the network, where each SS-TCN corrects the predicted probability values from the previous stage, ultimately yielding the action segmentation results. Comparative experiments show that our model performs excellently on three datasets: 50salads, Georgia Tech Egocentric Activities (GTEA), and Breakfast. | |
| dc.description.sponsorship | National Natural Science Foundation of China [61672190]; China Scholarship Council (CSC) [CSC202306120290]; Sub-project of National Key Research and Development Program of China [2023YFC3305003] | |
| dc.description.sponsorship | This research is supported by the National Natural Science Foundation of China, No. 61672190; China Scholarship Council (CSC), No. CSC202306120290; the Sub-project of National Key Research and Development Program of China (No. 2023YFC3305003). | |
| dc.identifier.citation | Gao, S., Wu, R., Liu, S., Erkan, U., Toktas, A., Liu, J., & Tang, X. (2024). MSLID-TCN: multi-stage linear-index dilated temporal convolutional network for temporal action segmentation. International Journal of Machine Learning and Cybernetics, 16(1), 567–581. https://doi.org/10.1007/s13042-024-02251-y | |
| dc.identifier.doi | 10.1007/s13042-024-02251-y | |
| dc.identifier.issn | 1868-8071 | |
| dc.identifier.issn | 1868-808X | |
| dc.identifier.scopus | 2-s2.0-85196320787 | |
| dc.identifier.scopusquality | Q1 | |
| dc.identifier.uri | https://doi.org/10.1007/s13042-024-02251-y | |
| dc.identifier.uri | https://hdl.handle.net/11492/10143 | |
| dc.identifier.wos | WOS:001249630500002 | |
| dc.identifier.wosquality | N/A | |
| dc.indekslendigikaynak | Web of Science | |
| dc.indekslendigikaynak | Scopus | |
| dc.institutionauthor | Erkan, Uğur | |
| dc.institutionauthorid | Erkan, Uğur/0000-0002-2481-0230 | |
| dc.language.iso | en | |
| dc.publisher | Springer Heidelberg | |
| dc.relation.ispartof | International Journal of Machine Learning and Cybernetics | |
| dc.relation.publicationcategory | Article - International Refereed Journal - Institutional Faculty Member | |
| dc.rights | info:eu-repo/semantics/closedAccess | |
| dc.snmz | KA_20250111 | |
| dc.subject | Temporal action segmentation | |
| dc.subject | Temporal convolutional network | |
| dc.subject | Multi-stage temporal convolutional | |
| dc.subject | Deep learning | |
| dc.title | MSLID-TCN: multi-stage linear-index dilated temporal convolutional network for temporal action segmentation | |
| dc.type | Article |