Have questions about using the Cloud Editing Timeline? What tricks made this customer exclaim "TQL" (Chinese slang for "amazing")?

Mondo Social Updated on 2024-02-04

This article is the sixth issue of the Alibaba Cloud Intelligent Media Services (IMS) Cloud Intelligent Editing Practice Guide. Drawing on customers' real-world scenarios, it shares some Timeline tips (AI TTS, main track, material alignment) to help customers reduce development time and cost.
Author: Uncle Ou

The story starts with a real piece of feedback from a customer.

One day, a customer joined the Intelligent Media Services Q&A group hoping to achieve a certain short-video effect, and the following conversation took place:

Explanation of the figure above: in Alibaba Cloud Intelligent Media Services (IMS) cloud editing, customers typically assemble a timeline and submit an editing job to synthesize the video they want. Timeline offers an AI TTS function, which makes it easy for customers to add spoken narration to their videos.

This function is very common in short-video synthesis, for example matching a store-visit video with advertising copy, or a product shot with a product introduction. To use it, the customer only needs to add a clip to the audio track and set the text content and voice of the narration. During actual synthesis, the engine first performs the speech synthesis and then mixes the result into the finished video; the customer only needs to call the editing job once in the whole process.

However, in actual use there can still be quality problems, such as the ones this customer ran into:

Before synthesis, the customer does not know how long the narration copy takes to read out, so the duration of the video track is hard to control.

As a result, the final synthesized video may end with black frames, or the narration finishes while the video is still playing. This is the customer's bad case: the narration has ended, but the video keeps going.

Cloud Editing Guide Issue 6, example video 1:

Timeline Example:
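The original timeline screenshot is not preserved here. Below is a minimal sketch of what such a timeline could look like; the field names (VideoTracks, AudioTracks, Type, Content, Voice) and the voice name are assumptions based on common IMS timeline conventions, so check the official IMS Timeline reference for the authoritative schema.

```python
import json

# Minimal sketch of a timeline with an AI TTS narration clip.
# All field names and values are illustrative assumptions, not the
# authoritative IMS schema.
timeline = {
    "VideoTracks": [{
        "VideoTrackClips": [
            {"MediaId": "example-video-media-id"}  # the footage to narrate over
        ]
    }],
    "AudioTracks": [{
        "AudioTrackClips": [{
            "Type": "AI_TTS",                    # ask the engine to synthesize speech
            "Content": "Welcome to our store!",  # the narration copy
            "Voice": "xiaoyun"                   # an example voice name
        }]
    }]
}

print(json.dumps(timeline, indent=2))
```

With this single timeline, one editing job both synthesizes the speech and mixes it into the output video.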

Main track: this function solves exactly this problem. When the customer sets a track as the main track, the other tracks in the timeline are truncated to the main track's duration. In the example above, the customer can set the narration track as the main track and fill the video track with footage that is long enough; the video track is then cut to the main track's length, so there are no black frames or misaligned endings.

The effect with the main track enabled:

Cloud Editing Guide Issue 6, example video 2:

Timeline example (note the MainTrack=true parameter):
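Since the screenshot is missing, here is a sketch of the same timeline with the narration track marked as the main track. The field names are assumptions modeled on the article's "maintrack=true" parameter; verify the exact casing and placement in the IMS Timeline reference.

```python
import json

# Sketch: mark the narration track as the main track so other tracks are
# truncated to its duration. "MainTrack" is an assumed spelling of the
# parameter the article calls maintrack=true.
timeline = {
    "VideoTracks": [{
        "VideoTrackClips": [
            # Fill in footage that is long enough; it will be cut to the
            # main track's length automatically.
            {"MediaId": "example-long-footage-id"}
        ]
    }],
    "AudioTracks": [{
        "MainTrack": True,  # this track's duration decides the output duration
        "AudioTrackClips": [{
            "Type": "AI_TTS",
            "Content": "This product ships in three colors.",
            "Voice": "xiaoyun"
        }]
    }]
}
print(json.dumps(timeline, indent=2))
```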

The next day, the customer came back again.

Explanation of the figure above: in many practical scenarios, customers need to know the precise duration of the synthesized speech so they can control the rest of their business precisely, for example matching the video and stickers to each sentence of copy. Customers can first call the intelligent speech interface to synthesize the narration and subtitles, and then match footage to the narration according to each sentence's duration.

This way, the whole compositing flow goes from one step to two, and you have to assemble the subtitle timeline yourself, which is more complicated than using AI TTS directly; but it gives the customer more flexible control, and it is also very common in real customer scenarios.

The whole implementation flow is as follows:
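The flow diagram is not preserved, but the core of step two can be sketched in a few lines: assuming the intelligent speech task has already returned per-sentence durations, lay the video clips and subtitle clips end to end against those durations. The sentences, durations, and field names below are all made-up illustrative data.

```python
# Step two of the two-step flow: build timeline in/out points from
# per-sentence narration durations (seconds). Illustrative data only;
# real durations come from the intelligent speech task's result.
sentences = [
    ("Welcome to our store!", 1.8),
    ("Today we introduce three new arrivals.", 2.6),
    ("First, the classic canvas tote.", 2.1),
]

video_clips, subtitle_clips = [], []
cursor = 0.0
for text, duration in sentences:
    clip = {"TimelineIn": round(cursor, 3), "TimelineOut": round(cursor + duration, 3)}
    video_clips.append(clip)                          # one piece of footage per sentence
    subtitle_clips.append({**clip, "Content": text})  # subtitle shown for the same span
    cursor += duration

print(video_clips[-1]["TimelineOut"])  # total narration length → 6.5
```

These clip lists would then be dropped into the video and subtitle tracks of the timeline before submitting the editing job.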

A few days later, the customer came back to the engineer.

Explanation of the figure above: this time the customer's synthesis flow looked like this, which was not what the engineer had previously suggested.

The problem the customer ran into this time: the sentence timestamps from the first speech-synthesis run are inconsistent with the result that AI TTS later produces during compositing. For underlying engine reasons, even with identical speech-synthesis parameters, each run can deviate by milliseconds, so it is not recommended to use a previous run's result as the reference for the next composition. Moreover, both the intelligent speech task and AI TTS actually perform speech synthesis, so the work is done and billed twice, which is not cost-effective in either cost or efficiency.

This time, the customer's scenario was to align the video footage with the narration copy sentence by sentence, to give the video more rhythm.

When you have this simple need to align footage across tracks, you can use the material alignment function directly. In the timeline, you can set an ID (ClipID) on each clip and a reference clip ID (ReferenceClipID) on a clip that should follow another. With a configuration like the following, the customer achieves the desired effect while submitting the editing job only once.

The final result is as follows:

Cloud Editing Guide Issue 6, example video 3:

Timeline Example:
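As the screenshot is missing, here is a sketch of what a material-alignment timeline could look like. The ClipID/ReferenceClipID names follow the article's spelling; the rest of the field names and media IDs are illustrative assumptions, so confirm them against the IMS Timeline reference.

```python
import json

# Sketch of material alignment: each narration clip gets a ClipID, and a
# video clip references one via ReferenceClipID so it is aligned to that
# sentence. Field names other than ClipID/ReferenceClipID are assumptions.
timeline = {
    "AudioTracks": [{
        "AudioTrackClips": [
            {"ClipID": "tts-1", "Type": "AI_TTS", "Content": "First sentence.", "Voice": "xiaoyun"},
            {"ClipID": "tts-2", "Type": "AI_TTS", "Content": "Second sentence.", "Voice": "xiaoyun"},
        ]
    }],
    "VideoTracks": [{
        "VideoTrackClips": [
            {"MediaId": "example-shot-1", "ReferenceClipID": "tts-1"},  # aligned to sentence 1
            {"MediaId": "example-shot-2", "ReferenceClipID": "tts-2"},  # aligned to sentence 2
        ]
    }]
}
print(json.dumps(timeline, indent=2))
```

One editing job submission is enough: the engine synthesizes the speech and aligns each shot to its referenced narration clip.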

Ten minutes later.

After several rounds of conversations with customers, we shipped a few more features.

Overall speed-fit to the main track: the problem with "truncating other tracks to the main track" is that the last clip may be cut off incomplete; if only a few tens of milliseconds of the last clip survive, the final shot flashes by and the experience suffers. When the video track's total duration differs from the narration track's but the customer wants the footage to play in full, the track-level speed adjustment can stretch or compress the whole track so that it ends together with the narration track.
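The arithmetic behind this speed-fit is simple and worth making explicit. The durations below are illustrative numbers, not taken from the article.

```python
# Sketch of the speed-fit idea: instead of truncating, play the whole
# video track at a uniform speed so it ends exactly with the narration
# (main) track. Durations are illustrative, in seconds.
video_track_duration = 13.04  # total footage length
narration_duration = 12.00    # main-track (narration) length

speed = video_track_duration / narration_duration  # >1 speeds up, <1 slows down
print(f"Apply overall speed {speed:.4f}x so both tracks end together")

# Played at this speed, the footage occupies exactly the narration's duration:
assert abs(video_track_duration / speed - narration_duration) < 1e-9
```

The same ratio works in the other direction: footage shorter than the narration yields a speed below 1, slowing the track down to fill the gap.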

Unilateral alignment: in some product-introduction scenarios the narration is often shorter than the corresponding video footage, and the customer expects the footage to finish playing after the narration ends, before the next product introduction begins. In this scenario, you can use the unilateral alignment capability.

For specific usage, please refer to the Intelligent Media Services feature release notes:

Finally, the engineer wants to add: there is an editing trick called the butt cut or straight cut, in which picture and sound start and end at the same point in time. Handled well, it matches the audience's expectations nicely; handled badly, it greatly hurts the viewing experience.

The customer scenario above is very common in short-video production. Handling it well not only avoids anomalies such as black frames, but also makes the finished video more rhythmic in both picture and sound, and we highly recommend it to customers doing short-video synthesis. Functions such as AI TTS, the main track, and material alignment are continuously polished based on large amounts of customer feedback and real-world scenarios. While guaranteeing the output quality, they can greatly reduce customers' development time, letting customers devote more energy to their business.

Many technical folks touch cloud editing for the first time without any editing background and run into all kinds of pitfalls. The Cloud Editing Guide will keep introducing tips for using the cloud editing Timeline, combined with customers' real scenarios, to make it easier to put to work.

IMS cloud intelligent editing is a production service built on cloud computing and artificial intelligence. It provides core capabilities such as live stream editing, video editing, a template factory, and digital human production, with AI assisting the editing process. The product can be widely used in industries such as the Internet, culture and media, advertising and marketing, education, and finance to meet enterprises' needs for large-scale, efficient, convenient, and intelligent content production.

Welcome to join the official DingTalk Q&A group for consultation and exchange: 48335001108
