CHANGE REQUEST

DASH-IF IOP | CR | | rev | - | Current version: 4.3
Status: |
|
Draft |
X |
Internal Review |
|
Community Review |
|
Agreed |
Title: Advanced Ad Insertion in DASH

Source: Ad Insertion TF

Supporting Companies: Hulu, Qualcomm, Tencent, Unified Streaming, <others to be added>

Category: A

Date: 2019-08-12

Reason for change: Ad Insertion is considered one of the most important aspects of online video distribution. With the development of CMAF, some additional aspects have become relevant, such as consistent development of ad content, content insertion into CMAF live content, etc. This document addresses the latest developments in the context of Ad Insertion and maps them to DASH.

Summary of change:
1) Description of most relevant use cases
2) Ad Insertion architecture
3) Definition of main content requirements and recommendations
4) Definition of ad content requirements and recommendations
5) Definition of combined main and ad content
6) Ad specific metadata
7) Ad tracking

Consequences if not approved: Insufficient Ad Insertion capabilities in DASH

Sections affected: References. The whole clause 8 on DASH Ad Insertion in DASH-IF IOP is replaced.

Other comments:
Disclaimer: This document is not yet final. It is provided for public review until the deadline mentioned below. If you have comments on the document, please submit comments by one of the following means:
- at the GitHub repository https://github.com/Dash-Industry-Forum/AdInsertion/issues, or
- dashif+iop@groupspaces.com with a subject tag [AdInsertion]
Please add a detailed description of the problem and the comment.
Based on the received comments a final document will be published latest by the expected publication date below, integrated in a new version of DASH-IF IOP if the following additional criteria are fulfilled:
- All comments from community review are addressed
- The relevant aspects for the Conformance Software are provided
- Verified IOP test vectors are provided

Commenting Deadline: September 30th, 2019

Expected Publication: December 15th, 2019
Contributors:
Zachary Cava (Hulu)
Thomas Stockhammer (Qualcomm)
Iraj Sodagar (Tencent)
Rufael Mekuria (Unified Streaming)
Andy Rosen (DSR)
Gary Hughes (independent)
Nicol So (Arris)
Will Law (Akamai)
Alex Giladi (Comcast)
Cooper Pope (Turner)
And others
Notes:
1) If appropriate, the references refer to specific versions of the specifications. However, implementers are encouraged to check later versions of the same specification, if available. Such versions may provide further clarifications and corrections. However, new features added in new versions of specifications are not added automatically.
2) Specifications not yet officially available are marked in italics.
3) Specifications considered informative only are marked in Arial.
[1] DASH-IF DASH-264/AVC Interoperability Points, version 1.0, available at http://dashif.org/w/2013/06/DASH-AVC-264-base-v1.03.pdf
[3] ISO/IEC 23009-1:2012/Cor.1:2013 Information technology -- Dynamic adaptive streaming over HTTP (DASH) -- Part 1: Media presentation description and segment formats. Note: this document is superseded by reference [4], but maintained as the initial version of this document is provided in the above reference.
[4] ISO/IEC 23009-1:2014 Information technology -- Dynamic adaptive streaming over HTTP (DASH) -- Part 1: Media presentation description and segment formats. Including:
· ISO/IEC 23009-1:2014/Cor 1:2015
· ISO/IEC 23009-1:2014/Cor 2:2017
· ISO/IEC 23009-1:2014/Amd 1:2015 High Profile and Availability Time Synchronization
· ISO/IEC 23009-1:2014/Amd 2:2015 Spatial relationship description, generalized URL parameters and other extensions
· ISO/IEC 23009-1:2014/Amd 3:2016 Authentication, MPD linking, Callback Event, Period Continuity and other Extensions.
· ISO/IEC 23009-1:2014/DAmd 4:2016 Segment Independent SAP Signalling (SISSI), MPD chaining, MPD reset and other extensions.
All the above is expected to be rolled into a third edition of ISO/IEC 23009-1 as:
· ISO/IEC 23009-1:2018 Information technology -- Dynamic adaptive streaming over HTTP (DASH) -- Part 1: Media presentation description and segment formats. [Note: Expected to be published by end of 2018. The draft third edition is available in the MPEG document m44441.]
In addition, the following documents are under preparation in MPEG:
· ISO/IEC 23009-1:2014/DCor 3:2018 [Note: Expected to be published by mid of 2019. The study of the COR is available as an output document w17951.]
· ISO/IEC 23009-1:2014/DAmd 5:2018 Device Information and other extensions. 2018 [Note: Expected to be published by mid of 2019. The DAM is available as an output document w18057.]
[5] ISO/IEC 23009-2:2014: Information technology -- Dynamic adaptive streaming over HTTP (DASH) -- Part 2: Conformance and Reference.
[7] ISO/IEC 14496-12:2015 Information technology -- Coding of audio-visual objects -- Part 12: ISO base media file format. This also includes amendments and corrigenda, for details see here: https://www.iso.org/standard/68960.html
[10] IETF RFC 6381, The 'Codecs' and 'Profiles' Parameters for "Bucket" Media Types, August 2011.
[11] ISO/IEC 14496-3:2009 - Information technology -- Coding of audio-visual objects -- Part 3: Audio with Corrigendum 1:2009, Corrigendum 2:2011, Corrigendum 3:2012, Amendment 1:2009, Amendment 2:2010, Amendment 3:2012, and Amendment 4:2014.
[14] ANSI/CEA-708-E: Digital Television (DTV) Closed Captioning, August 2013
[16] W3C Timed Text Markup Language 1 (TTML1) (Second Edition) 24 September 2013.
[17] SMPTE ST 2052-1:2013 "Timed Text Format (SMPTE-TT)", https://www.smpte.org/standards
[18] W3C WebVTT - The Web Video Text Tracks, http://dev.w3.org/html5/webvtt/
[19] ITU-T Recommendation H.265 (02/2018): "High efficiency video coding" | ISO/IEC 23008-2:2015/Amd 1:2015: "High Efficiency Coding and Media Delivery in Heterogeneous Environments – Part 2: High Efficiency Video Coding", downloadable here: http://www.itu.int/rec/T-REC-H.265
[21] IETF RFC 7230, Hypertext Transfer Protocol (HTTP/1.1): Message Syntax and Routing, June 2014.
[22] IETF RFC 7231, Hypertext Transfer Protocol (HTTP/1.1): Semantics and Content, June 2014.
[23] IETF RFC 7232, Hypertext Transfer Protocol (HTTP/1.1): Conditional Requests, June 2014.
[24] IETF RFC 7233, Hypertext Transfer Protocol (HTTP/1.1): Range Requests, June 2014.
[25] IETF RFC 7234, Hypertext Transfer Protocol (HTTP/1.1): Caching, June 2014.
[26] IETF RFC 7235, Hypertext Transfer Protocol (HTTP/1.1): Authentication, June 2014.
[27] SMPTE RP 2052-10-2013: Conversion from CEA-608 Data to SMPTE-TT https://www.smpte.org/standards
[28] SMPTE RP 2052-11-2013: Conversion from CEA 708 to SMPTE-TT https://www.smpte.org/standards
[29] ISO/IEC 14496-30:2014, "Timed Text and Other Visual Overlays in ISO Base Media File Format". Including:
· ISO/IEC 14496-30:2014, Cor 1:2015
· ISO/IEC 14496-30:2014, Cor 2:2016
[31] DASH Industry Forum, Test Cases and Test Vectors: http://testassets.dashif.org/.
[33] DASH Identifiers Repository, available here: http://dashif.org/identifiers
[34] DTS 9302J81100, “Implementation of DTS Audio in Media Files Based on ISO/IEC 14496”, http://www.dts.com/professionals/resources/resource-center.aspx
[35] ETSI TS 102 366 v1.2.1, Digital Audio Compression (AC-3, Enhanced AC-3) Standard (2008-08)
[36] MLP (Dolby TrueHD) streams within the ISO Base Media File Format, version 1.0, September 2009.
[39] DTS 9302K62400, “Implementation of DTS Audio in Dynamic Adaptive Streaming over HTTP (DASH)”, http://www.dts.com/professionals/resources/resource-center.aspx
[41] IETF RFC 6265: "HTTP State Management Mechanism", April 2011.
[42] ETSI TS 103 285 v.1.1.1: "MPEG-DASH Profile for Transport of ISO BMFF Based DVB Services over IP Based Networks".
[43] ANSI/SCTE 128-1 2013: "AVC Video Constraints for Cable Television, Part 1 - Coding", available here: http://www.scte.org/documents/pdf/Standards/ANSI_SCTE%20128-1%202013.pdf
[44] IETF RFC 2119, "Key words for use in RFCs to Indicate Requirement Levels", April 1997.
[45] ISO: “ISO 639.2, Code for the Representation of Names of Languages -- Part 2: alpha-3 code,” as maintained by the ISO 639/Joint Advisory Committee (ISO 639/JAC), http://www.loc.gov/standards/iso639-2/iso639jac.html; JAC home page: http://www.loc.gov/standards/iso639-2/iso639jac.html; ISO 639.2 standard online: http://www.loc.gov/standards/iso639-2/langhome.html.
[46] CEA-608-E, Line 21 Data Service, March 2008.
[47] IETF RFC 5234, “Augmented BNF for Syntax Specifications: ABNF”, January 2008.
[48] SMPTE ST 2086:2014, “Mastering Display Color Volume Metadata Supporting High Luminance And Wide Color Gamut Images”
[50] IETF RFC 7164, “RTP and Leap Seconds”, March 2014.
[52] IAB Video Multiple Ad Playlist (VMAP), available at https://www.iab.com/guidelines/digital-video-multiple-ad-playlist-vmap-1-0-1/
[53] IAB Video Ad Serving Template (VAST), available at https://www.iab.com/guidelines/digital-video-ad-serving-template-vast/
[54] ANSI/SCTE 35 2015, Digital Program Insertion Cueing Message for Cable
[56] ANSI/SCTE 214-1, MPEG DASH for IP-Based Cable Services, Part 1: MPD Constraints and Extensions
[57] ANSI/SCTE 214-3, MPEG DASH for IP-Based Cable Services, Part 3: DASH/FF Profile
[59] Common Metadata, TR-META-CM, ver. 2.0, January 3, 2013, available at http://www.movielabs.com/md/md/v2.0/Common_Metadata_v2.0.pdf
[60] IETF RFC 4648, "The Base16, Base32, and Base64 Data Encodings", October 2006.
[61] W3C TTML Profiles for Internet Media Subtitles and Captions 1.0 (IMSC1), Editor’s Draft 03 August 2015, available at: https://dvcs.w3.org/hg/ttml/raw-file/tip/ttml-ww-profiles/ttml-ww-profiles.html
[62] W3C TTML Profile Registry, available at: https://www.w3.org/wiki/TTML/CodecsRegistry
[63] ETSI TS 103 190-1 v1.2.1, “Digital Audio Compression (AC-4); Part 1: Channel based coding”.
[64] ISO/IEC 23008-3:2018, Information technology -- High efficiency coding and media delivery in heterogeneous environments -- Part 3: 3D audio.
[65] IETF RFC 5246, “The Transport Layer Security (TLS) Protocol, Version 1.2”, August 2008.
[66] IETF RFC 4337, “MIME Type Registration for MPEG-4”, March 2006.
[69] W3C Encrypted Media Extensions - https://www.w3.org/TR/encrypted-media/.
[72] ISO/IEC 23001-8:2013, “Information technology -- MPEG systems technologies -- Part 8: Coding-independent code points”, available here: http://standards.iso.org/ittf/PubliclyAvailableStandards/c062088_ISO_IEC_23001-8_2013.zip
[75] ETSI TS 101 154 v2.2.1 (06/2015): "Specification for the use of Video and Audio Coding in Broadcasting Applications based on the MPEG-2 Transport Stream."
[76] ETSI TS 103 285 v1.1.1 (05/2015): "Digital Video Broadcasting (DVB); MPEG-DASH Profile for Transport of ISO BMFF Based DVB Services over IP Based Networks.”
[77] 3GPP TS 26.116 (03/2016): "Television (TV) over 3GPP services; Video Profiles.”
[78] DECE (05/2015): “Common File Format & Media Formats Specification”, http://uvcentral.com/sites/default/files/files/PublicSpecs/CFFMediaFormat-2_2.pdf
[79] Ultra HD Forum: Phase A Guidelines, version 1.1, July 2015
[81] SMPTE ST 2086:2014, “Mastering Display Color Volume Metadata Supporting High Luminance And Wide Color Gamut Images”
[82] SMPTE ST 2094-1:2016, “Dynamic Metadata for Color Volume Transform – Core Components”
[83] SMPTE ST 2094-10:2016, “Dynamic Metadata for Color Volume Transform – Application #1”
[84] Recommendation ITU-R BT.1886: “Reference electro-optical transfer function for flat panel displays used in HDTV studio production”
[85] ETSI DGS/CCM-001 GS CCM 001 “Compound Content Management”
[86] VP9 Bitstream & Decoding Process Specification. https://storage.googleapis.com/downloads.webmproject.org/docs/vp9/vp9-bitstream-specification-v0.6-20160331-draft.pdf
[87] VP Codec ISO Media File Format Binding https://www.webmproject.org/vp9/mp4/
[88] ETSI TS 103 433-1, “High-Performance Single Layer High Dynamic Range (HDR) System for use in Consumer Electronics devices; Part 1: Directly Standard Dynamic Range (SDR) Compatible HDR System (SL-HDR1)”
[92] DASH-IF IOP: specification of live ingest, May, 2019, https://dashif-documents.azurewebsites.net/Ingest/master/DASH-IF-Ingest.html
[93] CableLabs Video-On-Demand Content Specification, Version 1.1, available at https://specification-search.cablelabs.com/cablelabs-video-on-demand-content-specification-version-1-1
[94] ISO/IEC 13818-1, MPEG-2 Part 1, Systems
[95] IAB Lab Tech Open Measurement SDK, available at https://iabtechlab.com/standards/open-measurement-sdk/
[96] W3C Media Source Extensions, available at https://www.w3.org/TR/media-source/
[97] W3C Encrypted Media Extensions, available at https://www.w3.org/TR/encrypted-media/
[98] Consumer Technology Association Web Application Video Ecosystem (CTA Wave), https://cta.tech/Research-Standards/Standards-Documents/WAVE-Project/WAVE-Project.aspx
[99] ISO/IEC 23000-19:2020, "Common Media Application Format", Second Edition FDIS is available as MPEG Output w18636.
[100] ISO/IEC 23009-1:2020, "Dynamic Adaptive Streaming over HTTP, Media presentation description and segment formats", Fourth Edition FDIS is available as MPEG Output w18609.
[101] ISO/IEC 23009-1:2020/Amd.1:2020, "Dynamic Adaptive Streaming over HTTP, Media presentation description and segment formats", Working Draft is available as MPEG Output w18641.
[102] www.videoservicesforum.org/activity_groups/RIST_poster_for_VidTrans2018Feb25.pdf
[103] ANSI/SCTE 130-3 2013, Digital Program Insertion-Advertising Systems Interface Part 3 https://www.scte.org/documents/pdf/Standards/ANSI_SCTE%20130-3%202013.pdf
Replace Clause 8 with the following
This clause provides an overview of guiding use cases considered in the context of ad insertion for DASH. The initial focus is on use cases addressed in clause 1.1.1.3 together with the transition issues in clause 1.1.1.7.
In a future version of this document, the remaining use cases will be addressed. However, the tools documented in this clause may well be usable for ad insertion in all documented use cases.
In this case content is statically defined and made available on demand to clients. Ad insertion takes place at pre-defined placement opportunities within the content. Opportunities are located at conventional pre-, mid-, and post-roll positions within the content.
No restriction is placed on the duration of the inserted ads. Service providers may choose to fill the opportunities when the client first requests content and/or when the client playout approaches the opportunity location. Service providers may also choose to skip an opportunity, in which case content will seamlessly continue.
If possible, content should be preconditioned such that segment boundaries are created at placement opportunities.
In this case content is being made periodically available to clients as part of a live event. Placement opportunities are signalled by the content author via in-band cues such as SCTE-35. Service providers may have the right to replace a subset or all of the placement opportunities that occur.
Opportunities will have an explicit expected duration announced with them and may come with little to no pre-warning. Inserted advertisements will replace in-stream content and should exactly match the expected opportunity duration to avoid delaying the main content.
While opportunities are generally expected to match the announced duration, in practice opportunities may be terminated early by the content author in response to the occurring event. In this case, the main content will take priority and the inserted advertisement will be cut short at the point of in-stream opportunity termination.
In addition to early termination, opportunities may be extended by the content author in response to the occurring event. In this case, the service provider may elect either to return to the main stream and use the original in-stream content for the remainder of the break, or to treat the extension as a new opportunity and fill the announced extended duration.
Service providers may choose to skip a replacement opportunity entirely, in which case the original in-stream content will be played instead.
If in-band cues are used to signal opportunities, the content encoding should produce exact segment boundaries at the cue points.
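The early-termination behaviour described above can be illustrated with a minimal sketch. The class and function names (`LiveOpportunity`, `ad_playout_window`) are hypothetical and not defined by this document; times are plain seconds on the main content timeline, and the sketch assumes that an ad longer than the announced duration is simply clamped to it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class LiveOpportunity:
    """Hypothetical model of a live placement opportunity."""
    start: float                            # seconds on the main timeline
    announced_duration: float               # duration announced with the cue
    terminated_at: Optional[float] = None   # set if the author ends the break early


def ad_playout_window(opp: LiveOpportunity, ad_duration: float) -> Tuple[float, float]:
    """Return (start, end) of the inserted ad on the main timeline.

    The ad never runs past the announced duration, and an early
    termination by the content author takes priority over the ad.
    """
    end = opp.start + min(ad_duration, opp.announced_duration)
    if opp.terminated_at is not None:
        # Main content takes priority: cut the ad at the termination point.
        end = min(end, max(opp.terminated_at, opp.start))
    return (opp.start, end)
```

For example, a 30 second ad placed in a 30 second opportunity starting at 100 s yields the window (100.0, 130.0); if the content author terminates the opportunity at 120 s, the window shrinks to (100.0, 120.0). Extension handling is left out, since the text above allows either of two provider policies.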
In this case content is a capture of a live stream that is made available on demand to clients. Placement opportunities are the same that occurred during the original live event. Service providers may have the right to replace a subset or all of the placement opportunities that occur.
Opportunities have an explicit duration and default content associated with them. Inserted advertisements will replace the default content and may vary in duration from the original content.
Service providers may choose to skip a replacement opportunity, in which case the default content will be played instead. Service providers may also choose to remove a placement opportunity, in which case the content before the opportunity will seamlessly transition to the content after the opportunity.
In this case a service provider desires to present an advertisement prior to entering a live stream. The advertisement is a static asset that is available on demand to clients and the live stream is being made periodically available to clients as part of a live event.
The advertisement may be of any duration desired and is not associated with any conditioning or markers in the live stream.
Following the playout of the advertisement, the client will join the live stream and no longer be able to access the original advertisement.
In this case a service provider wishes to present an advertisement with a content stream, but does not wish for the advertisement to be detectable by the client. To accomplish this the advertisement may be stitched into content prior to packaging and manifest generation such that there is a single asset produced containing the stitched assets.
This use case is not currently in the scope of this document, as the single asset will result in fewer interoperability challenges. However, the DASH-IF Working Group is continuing to study effective obfuscation methods and practices within DASH and will provide information in future editions of this document.
A complex issue in the playback of ad content in combination with main content is the transition between the two. This transition should be smooth and seamless, such that the user does not observe discontinuities, quality changes, audio glitches, rebuffering or other artefacts. DASH provides different signaling mechanisms to indicate how content is offered. In many cases, it is then a question of the capability of the underlying playback platform whether it can handle such content in a smooth and seamless manner or whether problems are to be expected. At such splice points, different issues may occur, some of which are listed below:
· Timeline discontinuities: in order to avoid rewriting content, the inserted content may not follow the timeline of the main content. However, this can be handled in DASH at Period boundaries.
· Overlaps, or possibly even gaps, of content on the master timeline: The content may not exactly match the envisaged insertion instructions, and hence content may overlap at the splice points or there may be gaps. DASH permits signaling of such properties and playback platforms can handle the playback of content with such properties.
· Encryption and key changes: In case of DRM protected content, changes of the encryption or of keys may happen at splice points. DASH permits signaling of such properties and playback platforms can handle the playback of content with such properties.
· Codec changes: Ads may be prepared with different codecs than the main content. This may result in complex codec change operations and not all platforms can handle such operations. DASH permits signaling of such changes, but also allows playback platforms to identify their capabilities, whether they can handle such changes or not.
· Codec profile/level changes: Similarly to the above, ads may be prepared with different codec profiles or levels. Again, DASH permits signaling of such changes, but also allows playback platforms to identify their capabilities, whether they can handle such changes or not.
· Signal changes (HDR/SDR, 4K/HD, Stereo/5.1): Ad content may be offered with different signal properties, for example the resolution of the video may change, the color space or transfer characteristics may change, or in audio, the channel configuration may change. Again, DASH permits signaling of such changes, but also allows playback platforms to identify their capabilities, whether they can handle such changes or not.
· Addition or removal of a track (e.g. a language, subtitle): At ad or program boundaries, certain tracks or sub-assets may not be available, for example a specific language may not be available, the content may not provide subtitles, or even the offering in a certain format or codec may not be available. Again, DASH permits signaling of such changes, but also allows playback platforms to identify their capabilities, whether they can handle such changes or not.
This specification addresses three aspects in the context of the above:
1) The signalling, in the DASH formatted content, of which changes may happen at splice points
2) Certain requirements on DASH formatted content in order to support playback on a majority of devices
3) The ability to signal the capabilities required of a playback platform in order to play back the content seamlessly.
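As an illustration of the splice point issues listed above, the following minimal sketch compares the properties of the content before and after a splice point and reports which kinds of change a playback platform would need to handle. The `ContentProps` structure and the `splice_changes` function are hypothetical helpers, not part of DASH; the sketch assumes an RFC 6381 style codecs string in which the part before the first dot identifies the codec and the remainder carries profile and level.

```python
from dataclasses import dataclass
from typing import FrozenSet, List, Optional


@dataclass(frozen=True)
class ContentProps:
    """Hypothetical summary of one side of a splice point."""
    codecs: str                  # RFC 6381 style string, e.g. "avc1.64001f"
    width: int
    height: int
    default_kid: Optional[str]   # None means the content is unencrypted
    languages: FrozenSet[str]    # languages of the offered tracks


def splice_changes(before: ContentProps, after: ContentProps) -> List[str]:
    """List the kinds of change that occur at a splice point."""
    changes: List[str] = []
    family_b, _, rest_b = before.codecs.partition(".")
    family_a, _, rest_a = after.codecs.partition(".")
    if family_b != family_a:
        changes.append("codec")
    elif rest_b != rest_a:
        changes.append("codec profile/level")
    if (before.width, before.height) != (after.width, after.height):
        changes.append("resolution")
    if before.default_kid != after.default_kid:
        changes.append("encryption/key")
    if before.languages != after.languages:
        changes.append("track set")
    return changes
```

A player or content validator could use such a list to decide whether a transition is expected to be seamless on a given platform. Timeline discontinuities, overlaps and gaps are not covered by this sketch.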
1.1.2. Definitions
ABR Encoder: live encoder that converts a broadcast stream or mezzanine into a ladder of different bit-rate tracks.
Ad Avail Processor: logical service that, given cue data, determines the placement of advertisement content within a stream and describes the necessary ad decision service communication
Ad Content Server: server storing the ad content and serving it on a per request basis.
Ad Creative: linear visual and auditory asset that represents the content of an advertisement
Ad Decision Service: functional entity that decides which ad(s) will be shown to the user. Its interfaces are deployment-specific and are out of scope for this document.
Ad Insertion MPD Manipulator: functional entity that proxies a DASH MPD and may change it to insert the ad creative in the streaming presentation. It may also embed other ad related metadata, or remove ad related metadata in the MPD.
Ad Pod: location or point in time where one or more ad slots may be scheduled for delivery; same as ad break, avail, and placement opportunity; pre-, mid-, and post- prefix may be used to denote pod location relative to content as before, during, and after respectively.
Ad Reporting Server: functional entity for collecting viewer impressions of advertisement content.
Ad Slot: single ad creative that is one of possibly many others that make up an ad pod
CDN node: functional entity returning a segment on request from DASH client. There are no assumptions on location of the node.
CMAF packager: functional entity, often residing with the ABR Encoder, which packages the adaptive bit-rate tracks into CMAF tracks.
DASH Ad resolver: functional entity which returns one or more remote elements on request from DASH client.
DASH Access Client: client consuming the DASH stream, possibly also containing functionality for client side ad insertion and viewer impression reporting.
DASH Ad resolver: functional entity which returns one or more ad creatives in a DASH formatted construct on request from a DASH Access Client.
DASH Packager: functional entity that processes conditioned content and produces media segments suitable for consumption by a DASH client. This entity is also known as fragmenter, encapsulater, or segmenter.
DASH-IF Ad Content: Content that follows specific restrictions and requirements according to this specification to be independently produced and inserted into well-formatted main content by simple MPD manipulation processes.
MPD Generator: functional entity returning an MPD on request from DASH client. It may be generating an MPD on the fly or returning a cached one.
Origin: functional entity that contains all media segments indicated in the MPD, and is the fallback if CDN nodes are unable to provide a cached version of the segment on client request.
Reference Playback Platform: reference platform for playback (e.g. HTML-5 MSE/EME)
Server-Side Ad Insertion (SSAI): ad serving architecture that interleaves content and ad assets prior to the stream reaching the client.
Server-Guided Ad Insertion (SGAI): ad serving architecture that fully describes ad opportunities within content prior to the stream reaching the client, but has the client resolve opportunities as needed.
Splice Point: point in media content where its stream may be switched to the stream of another content, e.g. to an ad.
Tracking Event: data payload associated with an ad creative that is emitted by an application when a specific time point or criteria is met during the creative playout.
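The relationship between Ad Pod, Ad Slot and Tracking Event in the definitions above can be sketched as a small data model. All class, field and function names here are illustrative assumptions; in particular, modelling the "specific time point" of a tracking event as an offset in seconds from the start of the creative is a simplification, and criteria-based events are not covered.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class TrackingEvent:
    name: str       # illustrative label, e.g. "start" or "firstQuartile"
    offset: float   # seconds from the start of the creative
    payload: str    # opaque data emitted when the event fires


@dataclass
class AdSlot:
    creative_id: str
    duration: float   # seconds
    tracking: List[TrackingEvent] = field(default_factory=list)


@dataclass
class AdPod:
    position: str     # "pre", "mid" or "post" relative to the content
    slots: List[AdSlot] = field(default_factory=list)

    def duration(self) -> float:
        """Total duration of the pod: the sum of its slot durations."""
        return sum(slot.duration for slot in self.slots)


def due_events(slot: AdSlot, previous: float, current: float) -> List[TrackingEvent]:
    """Events whose offset lies in [previous, current) of creative playout."""
    return [e for e in slot.tracking if previous <= e.offset < current]
```

An application would call `due_events` on each playback position update and emit the payload of every returned event. The half-open interval ensures that an event at offset 0 fires on the first update and that no event fires twice across consecutive updates.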
In the context of DASH-IF guidelines, primarily two architectures are considered. In the Server-Side Ad Insertion (SSAI) architecture, the ad is inserted in the network before reaching the DASH Client. In the Server-Guided Ad Insertion (SGAI) architecture, information about ad placement and resolution is inserted in the network, but final resolution is done on demand by the DASH client. The architectures share a significant amount of the functions and interfaces documented in Figure 1.
Figure 1 DASH-IF Ad Insertion Architecture
In this document, requirements and recommendations are provided for different interfaces. The main focus of the work is the interfaces to and from the DASH client. However, network interfaces and functions are also discussed as they impact the processing in certain functions.
Note 1: The above diagram combines the MPD and Segment servers; in a refined version they may be separated.
Note 2: The latency of each function/interface may be provided in a revised version. Input is welcome.
Note 3: The interface names are only numeric. Feedback is welcome on whether DASH-IF should provide more instructive names and, if so, on commonly agreed names for each of the interfaces. This is discussed in https://github.com/Dash-Industry-Forum/AdInsertion/issues/40, please join the discussion.
An overview of the functions and interfaces is provided in clause 1.1.3.
The Ad Insertion architectures start with the ingest of an input stream over IF-0 which is processed by an ABR Encoder and output as well-formed CMAF content over the IF-1 interface. A DASH Packager / MPD Generator uses IF-1 input to generate a conformant DASH content presentation that is sent over IF-2 and additional opportunity metadata that is sent over IF-3.
An Ad Insertion MPD Manipulator uses the inputs of IF-2 and IF-3 to generate a DASH presentation that is a mixture of content and advertisements. In the SSAI architecture, the manipulator uses IF-4 to ask an Ad Decisioning / Content Server to provide advertisement placements for the content stream before generating the final DASH MPD for IF-5 which contains metadata about the inserted ads via IF-6. In the SGAI architecture, the manipulator does not immediately use IF-4, instead it embeds opportunity information from IF-3 into the DASH MPD IF-5 output so that the DASH Client may later use IF-7 to retrieve the proper ad placements.
The DASH Client utilizes the reference media pipeline provided by IF-9 to perform seamless playout of the mixed content and ad presentation obtained via IF-5. Ad measurement and tracking is enabled in the client by IF-8 utilizing the ad metadata embedded as part of IF-6.
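The difference between the two architectures at the Ad Insertion MPD Manipulator can be sketched as follows. This is a minimal illustration under stated assumptions, not the method defined by this document: the function name `insert_ad_period` is hypothetical, the SGAI branch assumes that the opportunity is expressed as a remote Period using the DASH XLink attributes `xlink:href` and `xlink:actuate`, and timing attributes such as Period@start and Period@duration are omitted.

```python
import xml.etree.ElementTree as ET
from typing import Optional

DASH_NS = "urn:mpeg:dash:schema:mpd:2011"
XLINK_NS = "http://www.w3.org/1999/xlink"
ET.register_namespace("", DASH_NS)
ET.register_namespace("xlink", XLINK_NS)


def insert_ad_period(mpd_xml: str, index: int, *,
                     ad_period_xml: Optional[str] = None,
                     resolver_url: Optional[str] = None) -> str:
    """Insert an ad Period before the Period at `index` (hypothetical helper).

    SSAI: pass `ad_period_xml`, a fully resolved Period for the chosen ad.
    SGAI: pass `resolver_url`; an empty remote Period is inserted and the
    client resolves it later (assumed to correspond to IF-7).
    """
    root = ET.fromstring(mpd_xml)
    if ad_period_xml is not None:
        period = ET.fromstring(ad_period_xml)
    elif resolver_url is not None:
        period = ET.Element(f"{{{DASH_NS}}}Period")
        period.set(f"{{{XLINK_NS}}}href", resolver_url)
        period.set(f"{{{XLINK_NS}}}actuate", "onLoad")
    else:
        raise ValueError("either ad_period_xml or resolver_url is required")

    children = list(root)
    periods = [c for c in children if c.tag == f"{{{DASH_NS}}}Period"]
    if index < len(periods):
        root.insert(children.index(periods[index]), period)
    elif periods:
        root.insert(children.index(periods[-1]) + 1, period)
    else:
        root.append(period)
    return ET.tostring(root, encoding="unicode")
```

In the SSAI case the manipulator would first obtain the ad decision over IF-4 and pass the resulting Period; in the SGAI case it only needs a resolver URL derived from the IF-3 opportunity information. The URL used in any real deployment is deployment-specific.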
Table 1 details the defined interfaces with section references and some example instantiations. Each interface section provides an informative overview of the interface and, where aspects of the interface fall within the scope of this document, normative requirements.
Table 1 Interfaces identified in the ad insertion architecture, example instantiations and references within the document
Interface | Function | Example instantiations | Reference
IF-0 | ABR Stream Source | MPEG-2 TS, RIST | 1.2.1
IF-1a | Packager Ingest Media | DASH Ingest interface 1, Azure Smooth ingest, CMAF | 1.2.2
IF-1b | Packager Ingest Metadata | DASH Ingest interface 1 metadata, Azure Smooth ingest metadata | 1.2.2
IF-1c | Configuration Parameters | See for example DASH-IF IOP v4.3 and LL-DASH extensions | 1.2.2
IF-2 | Content Preparation | MPEG DASH, IOP v4.3 | 1.2.4
IF-3 | Ad Avail Signalling | SCTE-214.X, CableLabs | 1.2.5
IF-4a | Ad Decisioning Parameters | This specification | 1.2.6
IF-4b | Ad Content Conditioning Parameters | This specification | 1.2.6
IF-4c | Dynamic Ad Content Format | This specification | 1.2.6
IF-4d | Ad Storage Format | This specification | 1.2.6
IF-4e | Ad Selection Result | VAST/VMAP, SCTE-130 | 1.2.6
IF-5 | MPD and Segments with Ad Placement | MPEG DASH, IOP v4.3, this specification | 1.2.7
IF-6 | Ad Metadata Signalling | MPEG DASH, IOP v4.3 | 1.2.8
IF-7 | Remote Resolution with Decisioning Parameters | MPEG DASH, IOP v4.3 | 1.2.9
IF-8 | Ad Tracking and Measurement | VAST, Open Measurement SDK | 1.2.10
IF-9 | Reference Media Playback and Content Decryption | HTML-5 video, MSE, EME, CTA WAVE Device Playback Platform | 1.2.3
The formatting and delivery of media input to the ABR encoder is described by IF-0. The ad insertion architectures in this document are agnostic to the choice of this interface instantiation and as such information in this section shall be considered informational.
Example interface instantiations may differ depending on the type of media input being supplied to the architecture. For example, a VOD workflow may utilize a mezzanine delivery format such as the CableLabs Video-On-Demand Content Specification [93], while a LIVE workflow may utilize a contribution feed delivery format such as MPEG-2 TS [94] or RIST [102].
For any instantiation, it is usually beneficial for the media input to contain descriptive metadata about the media input such that the ABR encoder may provide conditioning of the encoded output and pass-through said information to components later in the ad insertion streaming architecture. As the format of descriptive metadata may be workflow specific, the examples provided below should be considered informational only.
In a LIVE workflow, the descriptive metadata may consist of program, segmentation, and splicing information; this document refers to this information as broadcast events. Examples of what broadcast events signal are program start/end, chapter start/end, interstitial, distributor start/end, provider break start/end, content identification, and many others. SCTE-35 and SCTE-104 are examples of standards for inserting such broadcast events aligned with the media presentation in IF-0. Figure 3 shows a segmentation of a live input based on SCTE-104/35 [54].
Figure 3 shows a live broadcast with segmented broadcast information based on broadcast events. In this case broadcast events are used to segment and can optionally be used to signal ad breaks. Nevertheless, more information is carried about the broadcast streams. The placement opportunities are shown in green. For more information relating to the commands supported we refer to [54].
Figure 3: segmented live broadcast with broadcast events [54]
In a VOD workflow, content is delivered to a
service provider by a content provider as a package of various assets and
metadata that make up the full description of the content. This package
contains mezzanine assets that streamable assets may be produced from, but may
also contain still image cover art, promotional assets, and preview trailer
assets. The metadata provided alongside assets include basic information such
as title, genre, and rating, but also includes advanced metadata such as
chapter locations, distribution subscriber requirements, and distributor ad
preservation requirements. One format of this package is described by the
CableLabs Video-On-Demand Content Specification [93], which we defer to for
further information about package structure and data.
Media provided in mezzanine or ingest is assumed
to have a continuous media time and the timestamp of the media carries through
the ABR encoder for each media type as shown in Figure 2. In addition, splice points are defined and at
these splice points, at a specific media time tsplice, the ABR
encoder is expected to prepare the content accordingly in order to permit
splicing. The reason and details of each splice point and the conditions may be
carried through but are irrelevant for the media preparation.
Figure 2 Abstracted Media Model with splice
points.
The ABR encoder provides encoded variants of the media input and prepares CMAF conforming headers, chunks and fragments as defined in ISO/IEC 23000-19 [99], organized in CMAF structures such as CMAF Tracks and Switching Sets. The content may also be provided together with an MPD that follows the DASH Profile for CMAF content as defined in ISO/IEC 23009-1 [101]. This reflects what is documented with IF-1a in Figure 1.
This CMAF-prepared content is assumed to be properly annotated through metadata. The metadata carries information that can be used by the DASH packager. This reflects what is documented with IF-1b in Figure 1. A recommended protocol for the combination of the two interfaces, IF-1a and IF-1b, is the DASH-IF Ingest Spec [92] (CMAF ingest interface).
In addition, the service follows certain service configuration options that are provided by external means. The configuration may include information such as the nominal CMAF fragment duration (DASH segment duration), CMAF chunk duration, number and bitrates in a CMAF Switching Set, codec configurations and media profiles, etc. This reflects what is documented with IF-1c in Figure 1.
The definition of this interface IF-1 is outside of the scope of this document, but in the following several assumptions are made on the generated media being provided to the DASH packager, predominantly that the encoder produces well-formatted CMAF conforming content [99]. Note that these assumptions are not a requirement for this specification, but a service provider should understand the downstream system effects if the packager ingest does not follow these assumptions. For example, transcoding or timeline corrections may need to be done in the DASH packager to meet the output requirements of the following interfaces, or a specific addressing scheme may have to be used.
The following assumptions are taken:
· The ABR encoder produces continuous content with a single CMAF Header for each CMAF Track. There may be instances that in between two potential splice points at media times tsplice,i and tsplice,i+1 not all Tracks/Switching Sets are provided. However, at least a minimum set of Switching Sets are always present.
· For those CMAF tracks that are present for the entire program, the media time is continuous, also across splice points. This means that the subset of continuously present CMAF Switching sets of the entire program conforms to a CMAF presentation as defined in ISO/IEC 23000-19, clause 7.3.6.
o Note: this assumption
may be relaxed, but if done, there needs to be a signaling for such a
discontinuity. Input on this subject would be welcome.
· There are three options for content in between two potential splice points at media times tsplice,i and tsplice,i+1.
o For Option 1 referred to as "Splice-Conditioned Packaging", the following holds:
§ The output of the ABR encoder in between two potential splice points at media times tsplice,i and tsplice,i+1 is CMAF conforming, i.e. it conforms to a CMAF presentation as defined in ISO/IEC 23000-19 [99], clause 7.3.6.
§ The first splice point at tsplice,i is the timeline origin of all CMAF tracks in the CMAF presentation as defined in ISO/IEC 23000-19 [99], clause 7.3.6.
Note: this does not imply that each splice point resets the timeline. Indeed, this would contradict the first assumption above that media is time-continuous.
§ The ABR encoder provides content that can be converted to conforming DASH content, for example consistent CMAF Fragment duration to enable proper usage of DASH Segment duration signaling, bitrate characteristics for signaling in the MPD, event messages, etc.
§ The ABR encoder creates a CMAF Fragment boundary for all CMAF Tracks at tsplice,i and resets the CMAF Fragment duration from here on.
Note: This permits Period boundary insertion at tsplice,i without modification of the CMAF content.
§ The ABR encoder creates media for all samples of all CMAF tracks in between tsplice,i and tsplice,i+1 with tsplice,i being included and tsplice,i+1 being excluded.
Note: This permits to create a Period that is fully covered by content.
§ The ABR encoder creates a CMAF Fragment boundary for all CMAF Tracks at tsplice,i+1 and resets the CMAF Fragment duration from here on.
Note: This permits Period boundary insertion at tsplice,i+1 without modification of the CMAF content.
o For Option 2 referred to as "Splice-Conditioned Encoding", the following holds:
§ The ABR encoder creates a SAP type 1 or 2 at tsplice,i with TSAP set to tsplice,i. The placement of the SAP type 1 or 2 may not, and typically does not, coincide with a CMAF Fragment boundary.
§ The ABR encoder creates a SAP type 1 or 2 at tsplice,i+1 with TSAP set to tsplice,i+1. The placement of the SAP type 1 or 2 typically does not coincide with a CMAF Fragment boundary.
o For Option 3 referred to as "Splice Point Signaling", no specific encoding and packaging is done at the splice points.
§ It may be the case that an exact alignment of a SAP type with the splice point is not possible, for example due to the codec or format properties. However, additional SAP types may be available, or the media can be accessed quickly by other means, for example by accelerated decoding.
· The ABR encoder passes through timed metadata (from contribution/production feed IF-0) related to the provided descriptive metadata and content conditioning, including the signaling and timing of each splice point tsplice,i.
· The content may be provided at once, for example as part of a VoD Asset generation, or the content may be provided by the ABR encoder on a continuous timeline, for which real time and media time advance concurrently.
Note: Splice points are defined independently of whether the content is entered or exited. Please provide feedback if this differentiation should be added in the final version of the document.
It may also be the case that within one content generation workflow, certain media encoding follows option 1 whereas others may follow option 2 or option 3. For example, video may follow option 1, and audio may be encoded based on option 3.
The three options for encoder and packager configuration are shown in Figure 3. In option 1, CMAF Fragment boundaries are aligned with splice points, and in option 2, splice points may occur in the middle of a CMAF Fragment, but are supported by a SAP type 1/2 for random access. In option 3, no SAP type 1 or 2 is necessarily provided at the splice point.
Note: As an example, please note that CMAF Fragment#3 in Option1 may be shorter or it may be even longer than CMAF Fragment #2 in order to align Splice Points with CMAF Fragment Boundaries.
Figure 3 CMAF Encoder and Packager options
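The distinction between the three options can be sketched in a few lines of Python. This is an illustration only: the function name and its inputs (lists of CMAF Fragment start times and of SAP type 1/2 presentation times, all expressed as integers in the track timescale) are assumptions made for this example and are not defined by this document.

```python
def classify_splice_conditioning(t_splice, fragment_starts, sap_times):
    """Classify how one CMAF Track is conditioned at a splice point.

    All times are integers in the track timescale (hypothetical inputs).
    Returns 1 (Splice-Conditioned Packaging), 2 (Splice-Conditioned
    Encoding) or 3 (Splice Point Signaling).
    """
    # Option 1: a CMAF Fragment boundary coincides with the splice point.
    if t_splice in set(fragment_starts):
        return 1
    # Option 2: no Fragment boundary, but a SAP type 1 or 2 at the splice point.
    if t_splice in set(sap_times):
        return 2
    # Option 3: neither; the splice point is only signaled.
    return 3


fragments = [0, 180000, 360000]      # Fragment start times (90 kHz ticks)
saps = [0, 180000, 270000, 360000]   # SAP type 1/2 times (90 kHz ticks)
print(classify_splice_conditioning(180000, fragments, saps))  # 1
print(classify_splice_conditioning(270000, fragments, saps))  # 2
print(classify_splice_conditioning(300000, fragments, saps))  # 3
```

Since different media types may follow different options, such a check would be applied per CMAF Track rather than per presentation.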
Another important assumption in the context of this specification is the availability of a reference playback platform that a DASH client can use for media playback and decryption. Without limiting the usage of any DASH player, this assumption permits content to be authored such that platforms with certain restrictions can be used.
The
DASH Client interacts with the media pipeline on the reference platform via the
IF-9 interface. The definition of this interface is out of the scope of this
document, but the general assumption of the DASH-IF IOP is an MSE [96] / EME
[97] reference pipeline.
Furthermore, it is assumed, for interoperability and/or robustness of this interface, that the reference playback platform supports the playback requirements defined by the Consumer Technology Association Web Application Video Ecosystem Project (CTA WAVE) Device Playback specification [98].
Specifically, in the context of this specification, a playback platform is expected to support playback requirements as documented in clause 8 of CTA-WAVE 5003 [98] for any content conforming to a CMAF Switching Set according to CMAF media profile included in an MPD, namely
- 8.2 Sequential Track Playback
- 8.3 Random Access to Fragment
- 8.4 Random Access to Time
- 8.5 Switching Set Playback
- 8.8 Playback over WAVE Baseline Splice
Constraints
- 8.13 Restricted Splicing of Encrypted
Content
- 8.14 Sequential Playback of Encrypted and Non-Encrypted Baseline Content
If a playback platform wants to consume content authored according to encoding and packaging option 2 or 3 ("Splice-Conditioned Encoding" and "Splice Point Signaling", respectively) as defined in clause 1.2.2 for content conforming to a CMAF Switching Set according to a CMAF media profile included in an MPD, it is expected to support the following playback requirements as documented in clause 8 of CTA-WAVE 5003 [98]:
- 8.9 Out-Of-Order Loading
- 8.10 Overlapping Fragments
Finally, it is assumed that the reference playback platform can be queried for its capabilities such that MPD information can be transformed into capability queries, e.g. whether a codec is supported. Device capability queries are discussed in clause 6.4 of CTA-WAVE 5003 [98].
Note: More details on device capability requirements are expected in an updated version.
The format and requirements of the DASH manifests and segments output by the DASH Packager / MPD Generator for use later in the ad insertion architecture are described by IF-2. The DASH IOP Guidelines provide the general normative requirements on the DASH output, and we assume those as a baseline set of requirements. Here we provide further normative requirements for the ad insertion architectures.
Generally, for each known ad splice point, the DASH Packager/MPD Generator should insert a Period boundary.
The recommendation of Period boundary generation at splice points within the DASH Packager / MPD Generator is made such that the downstream Ad Insertion MPD Manipulator can perform replacements and insertions on the MPD level only, without accessing the content segments. Should the DASH Packager / MPD Generator not be aware of which splice points are appropriate for ad insertion, the Period boundaries may be omitted and instead be created by the downstream Ad Insertion MPD Manipulator. Further details of this operation are provided as part of IF-5.
In the following it is assumed that at least one media type (typically video) follows the content generation according to clause 1.2.2, option 1 ("Splice-Conditioned Packaging"), and furthermore it is assumed:
· Each CMAF fragment generates one DASH Segment
· The content is provided in a live session, i.e. CMAF fragments are made available to the DASH packager once completed.
o NOTE: Low-Latency
operation will be added in the final version of the document in alignment with
the Low-Latency DASH extensions.
· The minimum splice point advance notice time is known, i.e. the DASH packager gets a pre-notification or ad avail for a splice point that will be added to the media. This allows the DASH Packager to configure the minimum update period of the MPD properly. By this, the DASH client or MPD proxy requests the MPD at a high enough frequency such that none of the announced Periods in the MPD are missed. For example, in SCTE-35, it is recommended to provide an advance notice of at least 4s [54].
o NOTE: please provide
feedback on practicability and other examples during community review phase. This
is discussed in https://github.com/Dash-Industry-Forum/AdInsertion/issues/46,
please join the discussion.
then
a DASH packager produces content by (i) generating an initial MPD, and (ii)
dynamic operation of the packager including MPD processing/updates and Segment
offering.
The
initial MPD is generated as follows:
· The CMAF data is mapped to the MPD using the DASH profile for CMAF content as defined in ISO/IEC 23009-1 [101].
· For every CMAF Switching Set that is known to be offered in the MPD, an Initialization Set as defined in ISO/IEC 23009-1 [101], clause 5.12 should be added that describes all known static parameters for the CMAF Switching Set, preferably based on the information in the CMAF Master Header (i.e. a CMAF Header that is sufficient to initialize the media pipeline for continuous playback, see CTA WAVE 5003 [98] for details) for this CMAF Switching Set.
o Every Initialization Set gets assigned a unique id.
o For every CMAF Switching Set that is not known to be offered on a continuous basis, the @inAllPeriods of the Initialization Set is set to false.
o For every CMAF Switching Set that is known to be offered on a continuous basis, the @inAllPeriods of the Initialization Set is set to true or the attribute is omitted.
· The MPD@availabilityStartTime is set to an arbitrary value.
· The @minimumUpdatePeriod is set sufficiently small such that DASH clients and MPD proxies do not miss Periods created for announced splice points, taking into account the minimum splice point advance notice time.
· The initial MPD follows the Main content in Table 1.
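One way to reason about the @minimumUpdatePeriod setting is as a time budget: the advance notice must cover the time the packager needs to publish the updated MPD, the refresh interval itself, and some margin for the client or proxy to fetch and process the update. The Python sketch below illustrates this under assumed values; the function name, the `publish_delay_s` and `fetch_margin_s` parameters and their defaults are hypothetical and not taken from this document or from SCTE-35.

```python
def minimum_update_period(advance_notice_s, publish_delay_s=0.5, fetch_margin_s=0.5):
    """Return an xs:duration string for MPD@minimumUpdatePeriod.

    The refresh interval is chosen such that
    publish_delay + refresh interval + fetch_margin <= advance notice,
    so an announced Period is seen before its splice point is reached.
    All parameter values are assumptions for illustration.
    """
    budget = advance_notice_s - publish_delay_s - fetch_margin_s
    if budget <= 0:
        raise ValueError("advance notice too short for the assumed delays")
    return f"PT{budget:.3f}S"


# With a 4 s advance notice and the assumed delays above:
print(minimum_update_period(4.0))  # PT3.000S
```

A deployment would replace the assumed delays with measured values for its own packager and CDN.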
For
every splice point i at time tsplice,i, a
new Period is generated as follows:
· If it is the first Period in the presentation and the media is "starting" to be produced, then
o @start of the Period is set to NOW - @availabilityStartTime with some possible margins to address different Segment availability times, for example due to publication delay on a CDN.
· If it is not the first Period in the presentation
o @start of the Period is set to the sum of the value of Period@start of the previous period and the interval between the two splice points (tsplice,i - tsplice,i-1)
o Period continuity is signaled across all Adaptation Sets that are continuing across the Period boundary. Preferably the same signaling and track structure is used.
· Every available CMAF Switching Set in the CMAF Presentation is mapped to one Adaptation Set using the DASH profile for CMAF content as defined in ISO/IEC 23009-1 [101]. Within one Adaptation Set the following parameters are set
o The @timescale attribute is set to the timescale of the CMAF Track
o The @presentationTimeOffset is set to tsplice,i normalized by the timescale.
o The @eptDelta and, if applicable, the @startNumber or SegmentTimeline is set to indicate the placement of the first Segment in the Period. Note if the content follows option 1, then @eptDelta is set to 0 and can be absent.
o Period continuity is signaled to indicate which Adaptation Set continuously follows the previous one.
o If the CMAF switching set is identical to one for which an Initialization Set was set, then all parameters from the Initialization Set are copied into this Adaptation Set and the @initializationRefId is set to the one referred to.
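The timing rules above can be sketched as follows. This is a minimal Python illustration, assuming that splice points are given as integers in the CMAF Track timescale and that the @start of the first Period has already been chosen; the function names are hypothetical.

```python
from fractions import Fraction


def period_timings(first_period_start_s, splice_times, timescale):
    """Compute (Period@start in seconds, @presentationTimeOffset) per splice point.

    splice_times are integers in the track timescale (assumed input).
    Period@start of each later Period is the previous Period@start plus
    the interval between the two splice points.
    """
    timings = []
    start = Fraction(first_period_start_s)
    for i, t_splice in enumerate(splice_times):
        if i > 0:
            start += Fraction(t_splice - splice_times[i - 1], timescale)
        # @presentationTimeOffset is the splice time in timescale units.
        timings.append((start, t_splice))
    return timings


def iso_duration(seconds):
    """Format seconds as an xs:duration string, e.g. PT40.000S."""
    return f"PT{float(seconds):.3f}S"


# Three splice points, 30 s apart, on a 90 kHz timescale;
# the first Period is assumed to start 10 s after availabilityStartTime.
for start, pto in period_timings(10, [0, 2700000, 5400000], 90000):
    print(iso_duration(start), pto)
```

Exact rational arithmetic is used so that repeated additions do not accumulate rounding errors over a long-running live session.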
Note 1: Instead of assumptions, this content may be changed into requirements. These requirements may then be signalled with a specific profile.
Note
2: This operation does not consider aspects such as inconsistent/variable
segment durations within a CMAF Presentation, upstream losses or errors, etc.
Any of such occurrences may result in additional Periods that may be added
according to the DASH-IF IOP guidelines.
The
mapping is shown in Figure
4.
Figure 4 CMAF Fragment to DASH Mapping for Option 1 and 3
NOTE 3: Signaling of content encoding
options 1 and 3 is for further study. Examples are how to signal @eptDelta, @presentationTimeOffset, etc.
Table
1 defines the Main live content MPD. More details need to be added.
Table 1 DASH-IF Main live content MPD
Element or Attribute Name | Use | Description
MPD | | Provides the requirements for DASH-IF main content. Any not specified value is identical to what is provided in ISO/IEC 23009-1 [10x], clause 5.3.1.
| ServiceDescription | 0 … N |
| | Latency@TargetLatency | O | A target latency may be provided
| @profiles | M | should include a profile indicator signaling http://dashif.org/guidelines/dashif-main-live-content (hopefully v5 defines this) and should include a profile identifier for the DASH CMAF profile "urn:mpeg:dash:profile:cmaf:2019" if all content is encoded following option 1.
| @type | M | Shall be set to dynamic
| @minBufferTime | M | Shall be present
| @suggestedPresentationDelay | R | Shall not be present
| @maxSegmentDuration | R | Shall not be present
| InitializationSet | 1 … N | At least one shall be present
| ProgramInformation | 0…N | This should be used to describe information about the main content. More details may be provided
| Period | 1 … N | One or more Periods shall be present. Provides the requirements for main live content. Any not specified value is identical to what is provided in ISO/IEC 23009-1, clause 5.3.2.
| | @xlink:href | R | Shall be absent.
| | @xlink:actuate | R | Shall be absent.
| | @start | M | Shall be present.
| | @duration | R | Shall not be present
| | BaseURL | 0 | Shall not be present
| | EventStream | 0...N | specifies an event stream. <What can we say about this? Event Streams that go across Periods?> This is discussed in https://github.com/Dash-Industry-Forum/AdInsertion/issues/45, please join the discussion.
| | AdaptationSet | 1...N | At least one Adaptation Set shall be present.
| | | @xlink:href | R | Shall be absent
| | | @xlink:actuate | R | Shall be absent
| | | | SegmentBase@presentationTimeOffset | OD | shall be set to the correct value of the presentation time of the Adaptation Set at the start of the Period, if the presentation time is not equal to 0.
| | | @contentType | M | Shall be present
| | | SegmentList | 0 | Shall be absent
| | | Representation | 1 … N | specifies a Representation. At least one Representation element shall be present in each Adaptation Set. Any not specified value is identical to what is provided in ISO/IEC 23009-1, clause 5.3.3.
| | Subset | 0 | Shall be absent.
| | EmptyAdaptationSet | 0 | Shall be absent
| UTCTiming | 1 … N | At least one shall be present
Opportunity metadata is made up of the original descriptive metadata of the input media related to signalling of ad opportunities and the content segmentation information generated by the DASH Packager. Carriage of opportunity metadata in the presentation output by the DASH Packager / MPD Generator is done via IF-3.
The following normative statements on opportunity metadata carriage are made:
· Opportunity metadata shall be carried through DASH MPD Events
The requirement of MPD Events over other carriage mechanisms is made such that the downstream Ad Insertion MPD Manipulator can perform insertions without accessing the content segments.
Note: Please check the above requirement during community review and comment if this overconstrains deployments.
While the carriage method is considered normative, the format of the metadata is workflow dependent. Examples of known schemes are provided in the subsequent sub-sections of this interface. For the purposes of this document, we will consider the carriage of opportunity metadata via SCTE-35 signalling sufficient, but other methods of equivalent means may be used.
SCTE-35 describes a set of command messages that can be utilized to describe ad opportunities within a presentation. Typically, the broadcast events for a LIVE presentation are already signalled as SCTE-35 commands which may be directly used, but a VOD workflow may optionally synthesize a series of SCTE-35 commands to describe the conditioning and opportunities in a VOD presentation as well.
The SCTE 214 specification defines a set of event schemes for carrying SCTE-35. The appropriate event scheme to use depends on the DASH mechanism used; DASH MPD Events may use either urn:scte:scte35:2013:xml or urn:scte:scte35:2014:xml+bin.
An example of carrying SCTE-35 with MPD Events using the urn:scte:scte35:2014:xml+bin scheme is shown in Table 2. The information described within the SCTE-35 payload provides the event properties @id, @presentationTime, and @duration. The id may be used to filter out duplicate events. In this example the payload is encoded using Base64 and enclosed in the <Binary> tag per SCTE 214.
Table 2 Example of a SCTE-35 message embedded as an MPD event using SCTE 214
<EventStream schemeIdUri="urn:scte:scte35:2014:xml+bin" timescale="1">
  <Event presentationTime="1540809120" duration="24" id="1999">
    <Signal xmlns="http://www.scte.org/schemas/35/2016">
      <Binary>/DAhAAAAAAAAAP/wEAUAAAfPf+9/fgAg9YDAAAAAAAA/APOv</Binary>
    </Signal>
  </Event>
</EventStream>
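A downstream component such as the Ad Insertion MPD Manipulator could read such an EventStream without touching any media segments. The following Python sketch, using only the standard library, extracts the event properties and the Base64-decoded payload. The function name and the returned dictionary layout are assumptions for illustration, elements are matched by local name so the sketch works with or without an MPD namespace, and the sample payload is a short placeholder rather than a complete SCTE-35 message.

```python
import base64
import xml.etree.ElementTree as ET


def local_name(tag):
    """Strip an optional '{namespace}' prefix from an element tag."""
    return tag.rsplit("}", 1)[-1]


def parse_scte35_events(event_stream_xml):
    """Return id, timing (in seconds) and raw payload for each Event."""
    root = ET.fromstring(event_stream_xml)
    timescale = int(root.get("timescale", "1"))
    events = []
    for event in root.iter():
        if local_name(event.tag) != "Event":
            continue
        binary = next(
            (el for el in event.iter() if local_name(el.tag) == "Binary"), None
        )
        has_payload = binary is not None and binary.text
        payload = base64.b64decode(binary.text.strip()) if has_payload else b""
        events.append({
            "id": event.get("id"),
            "presentation_time_s": int(event.get("presentationTime", "0")) / timescale,
            "duration_s": int(event.get("duration", "0")) / timescale,
            "payload": payload,  # binary SCTE-35 section, not parsed here
        })
    return events


SAMPLE = (
    '<EventStream schemeIdUri="urn:scte:scte35:2014:xml+bin" timescale="90000">'
    '<Event presentationTime="180000" duration="2700000" id="7">'
    '<Signal xmlns="http://www.scte.org/schemas/35/2016">'
    "<Binary>/DAh</Binary>"  # placeholder bytes only
    "</Signal></Event></EventStream>"
)
for ev in parse_scte35_events(SAMPLE):
    print(ev["id"], ev["presentation_time_s"], ev["duration_s"], ev["payload"].hex())
```

Interpreting the decoded bytes as a splice_info_section is left to a dedicated SCTE-35 parser and is outside the scope of this sketch.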
Information about ad content to insert into a presentation is retrieved from the Ad Decision and Ad Content server(s) via the IF-4 interfaces. The request from the Ad Insertion MPD Manipulator for ad content provides all the information needed to perform ad decisioning, including content metadata and opportunity descriptions. The response is then translated by the Ad Insertion MPD Manipulator into the DASH structures detailed in IF-5.
There are many details corresponding to the functions of ad requests and ad decisioning. This document therefore identifies five sub-interfaces that outline the primary interactions of ad requests and decisioning.
The five identified sub-interfaces are:
- Interface IF-4a: Ad Decision request parameters. For details see 1.2.6.2.
- Interface IF-4b: Content Conditioning request parameters. For details see 1.2.6.3.
- Interface IF-4c: Recommended Dynamic Ad Content response format. For details see 1.2.6.7.
- Interface IF-4d: Recommended Ad Content Storage format. For details see 1.2.6.6.
- Interface IF-4e: Ad Selection Result format. For details see 1.2.6.5.
1.2.6.2.1. Decisioning Parameters
A decisioning parameter is a piece
of information about the content stream, consumption medium, or end user that
is used by the Ad Decisioning Server as part of the advertisement qualification
and selection process. The Ad Insertion MPD Manipulator collects and sends this
information to the Ad Decisioning Server as part of IF-4a.
The transmission of decisioning parameters is highly integration dependent, but examples of commonly used industry parameters are:
· Content Unique Identifier
· Content Genre
· Content Language
· Service Provider Identifier
· Device Type (TV, SetTop, Mobile, Computer, etc.)
· Device Manufacturer
· Device Model
· End User IP Address
· End User Zip Code
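As an illustration of how such parameters might be conveyed over IF-4a, the Python sketch below assembles them into a query string. The endpoint URL and all parameter names are hypothetical, since the actual transmission format is integration dependent and not defined by this document.

```python
from urllib.parse import urlencode


def build_ad_decision_url(base_url, params):
    """Append the known decisioning parameters to an ad request URL.

    Parameters whose value is None are treated as unknown and omitted.
    """
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{base_url}?{query}"


# Hypothetical endpoint and parameter names, for illustration only.
url = build_ad_decision_url(
    "https://ads.example.com/decision",
    {
        "content_id": "abc123",
        "genre": "sports",
        "language": "en",
        "device_type": "TV",
        "zip": None,  # not known for this session
    },
)
print(url)
```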
1.2.6.2.2. Decisioning Modes
The decisioning mode of an Ad Decisioning Server dictates how the server chooses to fulfill ad requests made by a caller. The Ad Insertion MPD Manipulator must specify the decisioning mode for the Ad Decisioning Server to use via IF-4a based on the implemented ad insertion architecture. There are two general modes of ad decisioning that the SSAI and SGAI architectures respectively enable: stream level decisioning and pod level decisioning.
With stream level decisioning, all advertisement opportunities are decided prior to the DASH client receiving the stream. An SSAI architecture accomplishes this by having the Ad Insertion MPD Manipulator send the IF-3 supplied opportunity metadata to the Ad Decision server via IF-4a. The result of the ad decision request will contain advertisements for the entirety of the stream, which the Ad Insertion MPD Manipulator transforms into an IF-5 manifest with a mixture of content and advertisements.
After a DASH client receives a stream produced from an SSAI architecture, the stream will remain fixed for the duration of the playback session, e.g. the same advertisements will play again should the user choose to rewind the stream.
With pod level decisioning, advertisement opportunities are decided just as the DASH client reaches the opportunity within the stream. An SGAI architecture accomplishes this by having the Ad Insertion MPD Manipulator use the IF-3 supplied opportunity metadata to generate an IF-5 manifest with a mixture of content and remote entities that represent opportunities. As the client reaches remote entities during playout, the client utilizes IF-7 to return the opportunity metadata to the Ad Insertion MPD Manipulator, which then sends the data to the Ad Decision server via IF-4a. The result of the ad decision request will contain advertisements for this single opportunity, which the Ad Insertion MPD Manipulator transforms into an IF-7 response for the client to consume.
After a DASH client receives a stream produced from an SGAI architecture, the stream can continue to change for the duration of the playback session, e.g. the advertisements can be re-decisioned should the user choose to rewind the stream.
A conditioning parameter is a piece
of information about the encoding/packaging of the content stream or a client
player capability that is used by the Ad Content Server to ensure an ad
creative is compatibly encoded for inclusion in the generated presentation. The
Ad Insertion MPD Manipulator collects and sends this information to the Ad
Content Server as part of IF-4b.
The transmission of conditioning parameters
is highly integration dependent, but examples of commonly used industry
parameters are:
· Video / Audio Codecs
· Player Splice Condition Robustness
· Encryption Schemes
1.2.6.5.1. Overview
The response of the Ad Decision Server identifies the advertisements decisioned by the server and provides information associated with the advertisement such as general metadata, viewability requirements, media files, mezzanines, and tracking events. The actual ad content is provided by the Ad Content Server, preferably following the DASH-IF Ad Content format as defined in clause 1.2.6.6. Depending on the decisioning mode, the decision response may optionally contain the placement and ordering of advertisements as well.
While the general information carried in the response is described above, the explicit format of this response is workflow dependent. Examples of known industry formats are given in the subsequent sub-sections of this interface. For the purposes of this document we will assume VAST/VMAP is used, but other formats with equivalent data communication may be used.
1.2.6.5.2. IAB VAST and VMAP
An instantiation of IF-4e is standardized by the Interactive Advertising Bureau (IAB) as the Digital Video Ad Serving Template (VAST) [53] and Video Multiple Ad Playlist (VMAP) [52] specifications.
The VAST specification provides structure definitions for representing a variety of ad types, including linear, non-linear, and companion. A single VAST response may contain a stand-alone ad slot or a whole pod of ad slots, and each ad structure can provide general metadata, viewability requirements, media files, mezzanines, and tracking events.
The VMAP specification is a complement to the VAST specification as it describes a playlist structure that wraps one or more VAST ad responses to provide ad decisions for an entire stream. A VMAP response will provide the order and position that ad pods should occur in the content stream and may also provide additional tracking events for pod level tracking.
1.2.6.5.3. SCTE-130
Another instantiation of IF-4e is standardized by the Society of Cable Telecommunications Engineers (SCTE) as the response of the Ad Decision Service (ADS). The ADS is responsible for determining how advertising content is combined with non-advertising content. The exact format and schema of the ADS response is normatively defined within SCTE-130 Part 3 [103], which we defer to for further information.
This
interface provides a recommended content format for ad content that is expected
to be dynamically inserted into a DASH live or on demand Media Presentation.
Ad
content is recommended to follow the DASH-IF Ad content format as defined in
the following. This specification does not exclude the use of other content,
but the content author should be aware of any differences to the DASH-IF Ad
Content format.
DASH-IF Ad Content follows the restrictions and requirements of this specification and may be produced independently of the main content for insertion into well-formatted main content by simple MPD manipulation processes.
If content is offered conforming to the DASH-IF Ad Content format and follows the requirements and recommendations below, then it may be annotated with a @profiles parameter: "http://dashif.org/guidelines/dashif-ad-content".
The
following requirements for DASH-IF Ad Content apply:
- The content shall be provided as a DASH Media Presentation, i.e. a complete MPD with referenced Segments and shall follow the semantics in Table 2.
- The DASH Media Presentation shall conform to the DASH profile for CMAF content as defined in ISO/IEC 23009-1 [101].
Note: An important assumption for the above profile is the availability of content for CMAF Tracks over the entire Period. Content may be overlapping at the start of the Period or at the end of the Period.
- The DASH Media Presentation shall contain exactly one Period.
- The MPD@type shall be set to 'static'.
NOTE: Please provide comments if additional restrictions and requirements would be considered useful. Please join the discussion here: https://github.com/Dash-Industry-Forum/AdInsertion/issues/48.
The following recommendations for DASH Ad Content apply:
- The MPD should contain a profile indicator signaling "http://dashif.org/guidelines/dashif-ad-content"
- The content should be offered using the Segment timeline.
- The Segment durations within one Adaptation Set should be approximately identical.
- The content may and typically should include multiple variants for the same ad, for example different codecs, formats and resolutions in order for the Dynamic Conditioning, the MPD proxy or a DASH client to adjust the ad to the current playback conditions.
NOTE: Please provide comments if additional recommendations would be considered useful or if any of the recommendations should be removed or made a requirement. Examples are guidelines for exact duration signaling or encrypted content. Please join the discussion here: https://github.com/Dash-Industry-Forum/AdInsertion/issues/48.
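A few of the requirements and recommendations above lend themselves to an automated check. The following Python sketch tests an MPD for MPD@type, the number of Periods, and the profile identifiers. It is a partial illustration, not a conformance tool: the function name and the split into errors (requirements) and warnings (recommendations) are assumptions, and an absent MPD@type is treated as "static" following the default in the DASH schema.

```python
import xml.etree.ElementTree as ET

MPD_NS = "{urn:mpeg:dash:schema:mpd:2011}"
CMAF_PROFILE = "urn:mpeg:dash:profile:cmaf:2019"
AD_PROFILE = "http://dashif.org/guidelines/dashif-ad-content"


def check_ad_content_mpd(mpd_xml):
    """Return (errors, warnings) for a subset of the ad content rules."""
    root = ET.fromstring(mpd_xml)
    errors, warnings = [], []

    # Requirement: MPD@type shall be 'static'.
    if root.get("type", "static") != "static":
        errors.append("MPD@type shall be set to 'static'")

    # Requirement: exactly one Period.
    periods = root.findall(MPD_NS + "Period")
    if len(periods) != 1:
        errors.append(f"exactly one Period shall be present, found {len(periods)}")

    profiles = {p.strip() for p in root.get("profiles", "").split(",")}
    # Requirement: DASH profile for CMAF content.
    if CMAF_PROFILE not in profiles:
        errors.append("MPD@profiles shall include the DASH CMAF profile")
    # Recommendation: DASH-IF ad content profile indicator.
    if AD_PROFILE not in profiles:
        warnings.append("MPD@profiles should include the ad content indicator")

    return errors, warnings


SAMPLE = f"""<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static"
     profiles="{CMAF_PROFILE},{AD_PROFILE}" minBufferTime="PT2S">
  <Period duration="PT30S"><BaseURL>https://ads.example.com/ad1/</BaseURL></Period>
</MPD>"""
print(check_ad_content_mpd(SAMPLE))  # ([], [])
```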
Table 2 DASH-IF Ad content MPD
Element or Attribute
Name |
Use |
Description |
||||
MPD |
|
Provides the requirements for ad
insertion content. Any not specified value is identical to what is provided
in ISO/IEC 23009-1 [10x],
clause 5.3.1. |
||||
|
@profiles |
M |
should include a profile indicator signaling http://dashif.org/guidelines/dashif-ad-content and
shall include a profile identifier for the DASH CMAF profile "urn:mpeg:dash:profile:cmaf:2019". This also means that the content follows the CMAF
Profile. |
|||
|
@type |
M |
Shall be set to static |
|||
|
@mediaPresentationDuration |
R |
Shall not be present. |
|||
|
@minimumUpdatePeriod |
R |
Shall not be present, implied by type static. |
|||
|
@minBufferTime |
M |
Shall be present |
|||
|
@timeShiftBufferDepth |
R |
Shall not be present |
|||
|
@suggestedPresentationDelay |
R |
Shall not be present |
|||
|
@maxSegmentDuration |
R |
Shall not be present |
|||
|
@maxSubsegmentDuration |
R |
Shall not be present |
|||
|
ProgramInformation |
0…N |
This should be used to describe
information about the ad. More details may be provided |
|||
|
BaseURL |
0 |
Shall not be present. If a Base URL is present, then it is as part of the Period |
|||
|
Period |
1 |
Exactly one Period shall be present. Provides the requirements for ad insertion content. Any value not specified is identical to what is provided in ISO/IEC 23009-1, clause 5.3.2. |
|||
|
|
@xlink:href |
R |
Shall be absent. |
||
|
|
@xlink:actuate |
R |
Shall be absent. |
||
|
|
@start |
R |
Shall be absent, i.e. assumed to be 0. |
||
|
|
@duration |
O |
is set to the duration of the ad content |
||
|
|
BaseURL |
1…N |
At least one shall be present and refer to the BaseURL of the ad content. |
||
|
|
EventStream |
0...N |
Event Streams are permitted in ad content, for example for beaconing. Note: more details need to be added on specific types. Please join the discussion here: https://github.com/Dash-Industry-Forum/AdInsertion/issues/49 |
||
|
|
AdaptationSet |
1...N |
At least one Adaptation Set shall be present. |
||
|
|
|
@xlink:href |
R |
Shall be absent |
|
|
|
|
@xlink:actuate |
R |
Shall be absent |
|
|
|
|
InbandEventStream |
0...N |
Inband Event Streams are permitted in ad content, for example for beaconing. Note: more details need to be added on specific types. Please join the discussion here: https://github.com/Dash-Industry-Forum/AdInsertion/issues/49. |
|
|
|
|
CommonAttributesElements |
— |
specifies the common attributes and elements (attributes and elements from base type RepresentationBaseType). For details, see subclause |
|
|
|
|
|
SegmentBase@presentationTimeOffset |
OD |
shall be set to the correct value of the presentation time of the Adaptation Set at the start of the Period, if the presentation time is not equal to 0. <we make a recommendation that we make it zero, ask for community review.> |
|
|
|
@contentType |
M |
Shall be present |
|
|
|
|
SegmentList |
0 |
Shall be absent |
|
|
|
|
Representation |
1 … N |
specifies a Representation. At least one Representation element shall be present in each Adaptation Set. Any not specified value is identical to what is provided in ISO/IEC 23009-1, clause 5.3.3. |
|
|
|
Subset |
0 |
Shall be absent. |
||
|
|
EmptyAdaptationSet |
0 |
Shall be absent |
||
|
UTCTiming |
0 |
Shall not be present |
|||
|
LeapSecondInformation |
0 |
Shall not be present |
|||
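A subset of the constraints in Table 2 can be checked mechanically. The following is a minimal sketch only, not a conformance tool: the function name `check_ad_content_mpd`, the sample MPD and the `ads.example.com` URL are illustrative assumptions, and only some of the rows of Table 2 are covered.

```python
import xml.etree.ElementTree as ET

DASH_NS = "urn:mpeg:dash:schema:mpd:2011"
XLINK_NS = "http://www.w3.org/1999/xlink"
NS = {"d": DASH_NS}

# MPD-level attributes that Table 2 lists as "shall not be present".
FORBIDDEN_MPD_ATTRS = (
    "mediaPresentationDuration", "minimumUpdatePeriod", "timeShiftBufferDepth",
    "suggestedPresentationDelay", "maxSegmentDuration", "maxSubsegmentDuration",
)


def check_ad_content_mpd(mpd_xml):
    """Return a list of violations for a subset of the Table 2 constraints."""
    mpd = ET.fromstring(mpd_xml)
    problems = []
    if mpd.get("type") != "static":
        problems.append("MPD@type must be 'static'")
    if mpd.get("minBufferTime") is None:
        problems.append("MPD@minBufferTime must be present")
    for name in FORBIDDEN_MPD_ATTRS:
        if mpd.get(name) is not None:
            problems.append(f"MPD@{name} must not be present")
    if mpd.find("d:BaseURL", NS) is not None:
        problems.append("BaseURL must not appear at MPD level")

    periods = mpd.findall("d:Period", NS)
    if len(periods) != 1:
        problems.append("exactly one Period is required")
    for period in periods:
        if period.get(f"{{{XLINK_NS}}}href") is not None:
            problems.append("Period@xlink:href must be absent")
        if period.get("start") is not None:
            problems.append("Period@start must be absent")
        if period.find("d:BaseURL", NS) is None:
            problems.append("Period needs at least one BaseURL")
        adaptation_sets = period.findall("d:AdaptationSet", NS)
        if not adaptation_sets:
            problems.append("Period needs at least one AdaptationSet")
        for aset in adaptation_sets:
            if aset.get("contentType") is None:
                problems.append("AdaptationSet@contentType must be present")
            if aset.find("d:SegmentList", NS) is not None:
                problems.append("SegmentList must be absent")
            if aset.find("d:Representation", NS) is None:
                problems.append("AdaptationSet needs at least one Representation")
    return problems


# Hypothetical ad content MPD used only to exercise the checks.
SAMPLE = """<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static" minBufferTime="PT2S"
     profiles="urn:mpeg:dash:profile:cmaf:2019,http://dashif.org/guidelines/dashif-ad-content">
  <Period duration="PT30S">
    <BaseURL>https://ads.example.com/creative-1/</BaseURL>
    <AdaptationSet contentType="video">
      <Representation id="v1" bandwidth="1000000"/>
    </AdaptationSet>
  </Period>
</MPD>"""

print(check_ad_content_mpd(SAMPLE))
```

The checks return human-readable messages rather than raising, so that an ingest workflow could report all violations of an ad creative at once.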
Figure 5 provides an overview of the DASH-IF Ad content format.
Figure 5 Recommended Ad Content Format
The response format should follow the DASH-IF Ad Content
format as defined in clause 1.2.6.2.
Note: Please provide feedback if the dynamic response should permit multiple Periods. If no clear use case is provided, the response is restricted to a single Period.
If no conditioning parameters are provided, the response should include multiple content variants, e.g. multiple codecs, resolutions, etc.
If codec conditioning parameters are provided, the
response should include content options including at least one of the codecs.
If format conditioning parameters are provided, the
response should include content options including at least one of the supported
formats.
If encryption conditioning parameters are provided,
the response should include content options including at least one of the
supported encryption modes.
If the content needs to be obfuscated/blocked, then
the response should be adjusted to the main content.
If the content needs to be served through xlink with
Period, then only the Period of the main content is extracted.
If the content is used on live and especially
low-latency live services, then the content in the Period should be adjusted to
enable consistent playback including consistent join times.
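The conditioning rules above for codec, format and encryption parameters can be read as a filter over the available ad variants. The sketch below shows one possible interpretation; the variant dictionaries, the parameter names and the function `select_variants` are hypothetical and not defined by this document.

```python
def select_variants(variants, codecs=None, formats=None, encryption_modes=None):
    """Keep the variants compatible with every conditioning parameter provided.

    A parameter left as None means "no conditioning provided", in which case
    all variants remain eligible with respect to that parameter.
    """
    def allowed(value, accepted):
        return accepted is None or value in accepted

    return [
        v for v in variants
        if allowed(v["codec"], codecs)
        and allowed(v["format"], formats)
        and allowed(v["encryption"], encryption_modes)
    ]


# Hypothetical variants of a single ad creative.
VARIANTS = [
    {"id": "avc-clear", "codec": "avc1", "format": "cmaf", "encryption": "none"},
    {"id": "hevc-clear", "codec": "hvc1", "format": "cmaf", "encryption": "none"},
    {"id": "avc-cbcs", "codec": "avc1", "format": "cmaf", "encryption": "cbcs"},
]

# No conditioning parameters: all variants are offered.
print([v["id"] for v in select_variants(VARIANTS)])
# Codec conditioning only: at least one variant with a requested codec remains.
print([v["id"] for v in select_variants(VARIANTS, codecs={"hvc1"})])
```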
Note: This
clause will need more refinements and comments are welcome.
The format and requirements of the DASH manifest output by the Ad Insertion MPD Manipulator are described by IF-5, as shown in Figure 6. The Ad Insertion MPD Manipulator operates under the following assumptions:
1) It takes the DASH content provided by IF-2 as defined in clause 1.2.4.
2) Opportunity metadata is provided by IF-3 as defined in clause 1.2.5.
   a. The splice point timing, i.e. the media time at which the opportunity starts, is known as tsplice-in.
   b. The duration of the ad insertion opportunity is known as ADDU.
   c. At tsplice-in + ADDU, another splice point, tsplice-out, exists in the main media to rejoin the main content.
   d. Each splice point is signaled with a Period boundary: tsplice-in matches the Period@start of the main content at the opportunity, and tsplice-out matches the Period@start of the main content when transitioning back.
3) Properly conditioned ad content is received via communication on IF-4 as defined in clause 1.2.6.
Based on this input the MPD manipulator produces a Media
Presentation such that the content on IF-5 can be played back by the DASH
client. The transport of additional ad metadata along with linear ad creatives
is described as part of IF-6.
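The timing assumptions on the opportunity metadata can be illustrated with a short sketch. It assumes that the splice-out point equals tsplice-in + ADDU and that both splice points must coincide with a Period@start of the main content; the function names and the example values are hypothetical.

```python
from fractions import Fraction


def splice_out(t_splice_in, addu):
    """Splice-out point, read here as tsplice-in + ADDU (in seconds)."""
    return t_splice_in + addu


def opportunity_is_conditioned(period_starts, t_splice_in, addu):
    """True if both splice points coincide with a Period@start of the main content."""
    starts = set(period_starts)
    return t_splice_in in starts and splice_out(t_splice_in, addu) in starts


# Hypothetical Period@start values of the main content, in seconds.
main_period_starts = [Fraction(0), Fraction(600), Fraction(630)]

print(opportunity_is_conditioned(main_period_starts, Fraction(600), Fraction(30)))
print(opportunity_is_conditioned(main_period_starts, Fraction(600), Fraction(45)))
```

Exact rational arithmetic is used so that the comparison against Period boundaries is not affected by floating-point rounding.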
Figure 6 MPD Manipulator Operation: Conforming IF-5 output
In this first version of the document, the following assumptions are made:
1) The main content on IF-2 follows the definitions in clause 1.2.4 with the assumption of using option 1 for at least one continuous Switching Set and option 3 for the remaining ones on IF-1. Note that all Switching Sets may follow option 1 conditioning.
2) The ad content on IF-4b follows the DASH Ad insertion content as defined in clause 1.2.6.6. In addition, the content may be conditioned, for example such that for any Initialization/Adaptation Set present in the MPD on IF-2, at least one Adaptation Set is provided in the ad content on IF-4b.
3) The ad content on IF-4b has at least the duration ADDU, but may be marginally longer. If longer, then it is expected that the content can be cut at the end.
Note that the above assumptions are not a requirement for this specification. However, MPD proxy implementers should be aware of the consequences for the operation of the service if the content does not follow the above assumptions.
The DASH IOP Guidelines provide the general normative requirements on the DASH output, and those are assumed as a baseline set of requirements. The following further normative requirements apply to the ad insertion architectures:
- The MPD shall contain multiple Period elements describing the content and advertisement(s) of the presentation.
- Content and/or advertisement assets split across two or more Periods must have period-connectivity signaled.
Note: this needs more detailed requirements. The final version will add this.
The ad content is spliced into the main content as a
Period following Table 3.
<We need TargetLatency.>
Table 3 Ad Content spliced into main content
Element or Attribute
Name |
Use |
Description |
||||
MPD |
|
Provides the requirements for content that combines main content and ad content. Any value not specified is identical to what is provided in ISO/IEC 23009-1, clause 5.3.1. |
||||
|
ServiceDescription |
|
|
|||
|
|
Latency@TargetLatency |
0 |
Target Latency is provided |
||
|
@profiles |
M |
More details need
to be added |
|||
|
Period (Main content) |
|
specifies the information of a Period. The information from the main content Period is reused except where specified differently |
|||
|
|
@duration |
R |
Shall not be present, the duration is determined by the @start of the Ad Period. |
||
|
|
EventStream |
0...N |
re-used from the main content. |
||
|
|
AdaptationSet |
1...N |
re-used from the main content. |
||
|
Period (Ad Content) |
|
specifies the information of a Period. The information from the Ad content Period is reused except where specified differently |
|||
|
|
@id |
M |
A unique identifier, preferably re-using the one already present in the main content. |
||
|
|
@start |
M |
is set to tsplice-in from the main content |
||
|
|
@duration |
O |
is set as follows: - if the value from the ad content is larger than or equal to the duration of the ad slot, i.e. ADDU, then this value is set to ADDU. - if the value from the ad content is smaller than the duration of the ad slot, i.e. ADDU, then <what to do?>. |
||
|
|
BaseURL |
1…N |
re-used from the remote Ad content Period unless the ad content is moved elsewhere. |
||
|
|
@availabilityTimeOffset=<ad duration> |
|
|
||
|
|
EventStream |
0...N |
re-used from the remote Ad content Period unless the proxy decides to remove them based on business rules. Note: copying Event Streams from the main content into this ad Period is also permitted; the decision on which ones to copy is implementation logic. Please provide feedback on this. |
||
|
|
AdaptationSet |
1...N |
re-used from the remote Ad content Period, but only a subset may be picked based on the main content or information from the client. For more details on how Adaptation Sets are added, please refer to Table 3. |
||
|
Period (main content) |
|
specifies the information of a Period. The information from the main content Period is reused except where specified differently |
|||
|
|
@start |
M |
is set to tsplice-out from the main content |
||
|
|
@duration |
R |
shall not be present |
||
|
|
EventStream |
0...N |
Re-used from main content. |
||
|
|
AdaptationSet |
1...N |
re-used from main content. |
||
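The Period layout of Table 3 can be sketched as follows. This is an illustration under assumptions, not the normative output of the MPD Manipulator: the Period identifiers, the helper names and the ad URL are hypothetical, only the timing attributes and the ad BaseURL are generated, and the case of ad content shorter than ADDU is rejected because the table leaves that case open.

```python
import xml.etree.ElementTree as ET


def seconds_to_duration(seconds):
    """Format non-negative seconds as an xs:duration with millisecond precision."""
    ms = round(seconds * 1000)
    return f"PT{ms // 1000}.{ms % 1000:03d}S"


def ad_period_duration(ad_duration, addu):
    """Apply the @duration rule for the ad Period."""
    if ad_duration >= addu:
        return addu  # ad content is cut to the duration of the ad slot
    # Assumption: the shorter-than-slot case is not yet specified, so refuse it.
    raise ValueError("ad content shorter than ADDU: behaviour not specified")


def splice(t_splice_in, addu, ad_duration, ad_base_url):
    """Build the main / ad / main Period sequence with timing attributes only."""
    mpd = ET.Element("MPD")
    # Main content before the ad: no @duration, it ends at the ad Period@start.
    ET.SubElement(mpd, "Period", id="main-1")
    ad = ET.SubElement(
        mpd, "Period", id="ad-1",
        start=seconds_to_duration(t_splice_in),
        duration=seconds_to_duration(ad_period_duration(ad_duration, addu)),
    )
    ET.SubElement(ad, "BaseURL").text = ad_base_url
    # Main content after the ad starts at the splice-out point.
    ET.SubElement(mpd, "Period", id="main-2",
                  start=seconds_to_duration(t_splice_in + addu))
    return mpd


mpd = splice(600.0, 30.0, 30.5, "https://ads.example.com/creative-1/")
print(ET.tostring(mpd, encoding="unicode"))
```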
<Editor's Note: This clause needs some updates on MPD Proxy operation.>
In an SSAI architecture, the Ad Insertion MPD Manipulator may utilize IF-4 to request an advertisement decision for part or all of the content stream. For some embodiments, the ad placements provided by an ad decision will be represented as a set of ad pods positioned within the content stream, with each pod containing a series of one or more ad slots. In this case, the manipulator will create a new Period for each ad slot, with a sequence of Periods representing a complete ad pod. In other embodiments, the ad placements for a stream may have only one creative for the entire pod. In this case, the manipulator will create a single new Period for each pod.
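The mapping from ad pods to Periods described above can be sketched as follows. The `AdSlot` and `AdPod` types and the Period identifier format are hypothetical; a real ad decisioning response on IF-4 will have its own structure.

```python
from dataclasses import dataclass


@dataclass
class AdSlot:
    creative_id: str
    duration: float  # seconds


@dataclass
class AdPod:
    start: float     # position of the pod in the content stream, seconds
    slots: list      # list of AdSlot, in playout order


def periods_for_pod(pod):
    """One Period per ad slot, laid out back to back from the pod start."""
    periods = []
    offset = pod.start
    for index, slot in enumerate(pod.slots):
        periods.append({
            "id": f"ad-{slot.creative_id}-{index}",
            "start": offset,
            "duration": slot.duration,
        })
        offset += slot.duration
    return periods


pod = AdPod(start=600.0, slots=[AdSlot("a", 15.0), AdSlot("b", 15.0)])
for period in periods_for_pod(pod):
    print(period)
```

A pod with a single creative is simply the one-slot case and yields a single Period.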
The manipulator inserts the Period
element(s) representing the pod(s) at the content stream position specified by
the ad decisioning response. If the content MPD provided to the manipulator
already has Period boundaries created at the desired ad insertion points, it
may directly insert the Period(s) representing the ad placements. If the
manipulator is provided a content MPD without Period boundaries at the desired
insertion points, the manipulator must split the content Period(s) utilizing
the functions described in the Period Splitting section of the IOP [ref].
Splitting operations may require the manipulator to access the content segments in addition to the content MPD. For this reason, it is recommended that the content MPD is provided with Period boundaries already generated. If the Ad Insertion MPD Manipulator does not perform an insertion, it may choose to remove Period boundaries by recombining the Periods. The manipulator may also do this to provide responses to DASH clients that cannot handle Multi-Period responses.
For scenarios where original in-stream ads are being replaced by the Ad Insertion MPD Manipulator instead of inserted into a clean stream, the manipulator would create the same Period(s) for ad creatives and replace sections of the content stream. If the ads are already delineated in the content MPD, then the in-stream Periods are replaced by the generated ones. If the ads are not delineated in the content MPD, then the manipulator must perform splitting operations prior to replacement.
In an SGAI architecture, the Ad Insertion MPD Manipulator utilizes the opportunity metadata provided by IF-3 to embed signals into the MPD response that allow the DASH client to defer ad opportunity resolution until it is actively needed for the presentation.
Note: The DASH-IF Working Group
is actively studying the DASH mechanisms appropriate for enabling these signals
and will provide further information in future IOP updates.
<Editor's Note: This clause is
expected to be completed in the final version.>
<Assume TargetLatency and
document playback>
1.2.8.1. Introduction
Ad metadata, such as creative descriptions,
viewability requirements, and tracking events, is provided to the Ad Insertion
MPD Manipulator as part of the IF-4 ad decisioning response. To enable client
usage of this metadata, the manipulator may provide the metadata in the
presentation via IF-6. Similar to IF-3, the following normative statements on
ad metadata carriage are made:
- Ad metadata must be carried through one of the following mechanisms:
   - DASH MPD Events [ref]
   - DASH Inband Event Messages [ref]
- DASH MPD Events should be the preferred carriage mechanism
The recommendation of MPD Events over other carriage mechanisms is made such that the Ad Insertion MPD Manipulator can provide ad metadata to the DASH client without modifying the ad segments.
For both mechanisms, the DASH client aligns the surfacing of the event data to the client application with the timed playout of the Period. See DASH-IF Event Processing [ref] for further information.
The format and usage of ad metadata is integration specific and is therefore out of the scope of this document. Known metadata event schemes are provided in the subsequent sub-sections of this interface as informational examples.
1.2.8.2. DASH Callback Event
MPEG-DASH defines a basic callback event scheme denoted by the scheme id “urn:mpeg:dash:event:callback:2015” and value=1. When a DASH client encounters this scheme, it treats the message data payload of each event as a URI and performs a GET request, ignoring the response. This functionality was designed to directly facilitate the requirements of basic timed tracking events, such as those established in the IAB VAST specification.
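As an illustration, an MPD EventStream using the callback scheme could be generated as sketched below. The sketch assumes that the URI is carried as the content of each Event element and that a timescale of 1000 is used; the helper name and the beacon URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

CALLBACK_SCHEME = "urn:mpeg:dash:event:callback:2015"


def callback_event_stream(beacons, timescale=1000):
    """Build an EventStream whose events carry beacon URLs.

    beacons is a list of (time_in_seconds, url) pairs, with times relative to
    the start of the Period that carries the EventStream.
    """
    stream = ET.Element(
        "EventStream",
        schemeIdUri=CALLBACK_SCHEME, value="1", timescale=str(timescale),
    )
    for event_id, (time_s, url) in enumerate(beacons):
        event = ET.SubElement(
            stream, "Event",
            id=str(event_id),
            presentationTime=str(round(time_s * timescale)),
        )
        # Assumption: the URI is carried as the message data in the Event body.
        event.text = url
    return stream


stream = callback_event_stream([
    (0.0, "https://tracking.example.com/start"),
    (7.5, "https://tracking.example.com/firstQuartile"),
])
print(ET.tostring(stream, encoding="unicode"))
```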
1.2.9.1. Introduction
In an SGAI architecture, instead of providing ad placements directly, the MPD provided via IF-5 contains information on how to resolve the ad placements as the DASH client needs them for playout of the presentation. The interface to enable this late resolution is provided via IF-7.
The DASH-IF Working Group is actively studying the
DASH mechanisms appropriate for enabling this interface and will provide
normative information in a future update of the IOP. The subsequent subsections
of this interface detail initial thoughts on this interface and should be
considered informational only.
1.2.9.2. Late Binding via Remote Periods
MPEG-DASH defines the XLink mechanism [ref] for enabling remote elements within an MPD. The Remote Period variant of remote elements can be used in an SGAI architecture to delay the resolution of ad opportunities. In particular, a Remote Period with @xlink:actuate=“onRequest” can be inserted into an MPD as an ad pod placeholder, and the DASH client will perform resolution as the portion of the timeline containing the Remote Period is approached during presentation playout. The response for the resolution can contain one or more Periods to represent one or more ad slots within the ad pod.
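A Remote Period placeholder of this kind could be generated as in the following sketch. The Period identifier, the start time and the resolver URL are hypothetical, and no further attributes of the placeholder are shown.

```python
import xml.etree.ElementTree as ET

XLINK_NS = "http://www.w3.org/1999/xlink"
# Serialize the XLink namespace with the conventional "xlink" prefix.
ET.register_namespace("xlink", XLINK_NS)


def remote_period(period_id, start, resolver_url):
    """Create a Period placeholder that the client resolves on request."""
    return ET.Element("Period", {
        "id": period_id,
        "start": start,
        f"{{{XLINK_NS}}}href": resolver_url,
        f"{{{XLINK_NS}}}actuate": "onRequest",
    })


placeholder = remote_period(
    "ad-pod-1", "PT600S", "https://ads.example.com/resolve?pod=1"
)
print(ET.tostring(placeholder, encoding="unicode"))
```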
Remote elements must be further studied for interoperability guidelines. In particular, the usage of remote elements within a dynamic MPD has not been studied sufficiently to understand the restrictions and constraints that such an MPD would impose on remote elements.
1.2.9.3. Decisioning Parameters via URL Parameters
Just as in an SSAI architecture, an SGAI architecture must provide the decisioning and conditioning parameters to the Ad Decisioning / Ad Content Server via IF-4, but unlike SSAI, IF-4 is used as needed instead of pre-emptively. This means that the service entity handling the late resolution of the ad opportunity must be made aware of the parameters that were previously known to the Ad Insertion MPD Manipulator.
MPEG-DASH defines the Flexible Insertion of URL
Parameters mechanism to enable the dynamic creation of request parameters by
the DASH client, combining information provided in the manifest and information
available from other requests. As the Ad Insertion MPD Manipulator is
constructing the MPD provided via IF-5, it may utilize the Flexible Insertion
mechanism to embed the content decision and conditioning parameters known to it
such that they are properly transmitted to the late resolution handler without
the need for server-side state.
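The idea of carrying the parameters in the request itself, so that the late resolution handler needs no server-side state, can be illustrated as follows. This sketch does not use the MPEG-DASH Flexible Insertion of URL Parameters syntax; it only shows plain query-string encoding and decoding, and all parameter names and URLs are hypothetical.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def resolution_url(base_url, decision_params, conditioning_params):
    """Embed all parameters known at MPD generation time into the URL."""
    query = {**decision_params, **conditioning_params}
    # Sort for a deterministic URL, which helps caching and debugging.
    return f"{base_url}?{urlencode(sorted(query.items()))}"


url = resolution_url(
    "https://ads.example.com/resolve",
    {"content_id": "show-42", "addu": "30"},
    {"codecs": "avc1.64001f"},
)

# The late resolution handler recovers the parameters from the request alone.
recovered = parse_qs(urlsplit(url).query)
print(url)
print(recovered)
```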
In addition to the parameters known at original MPD
generation time, further investigation is being conducted by the DASH-IF
Working Group to determine if player runtime conditions could be additionally
included via this mechanism to provide greater detail around active seamless
playout requirements.
1.2.10.1. Introduction
It is common practice for advertisements to utilize impression tracking from the client to report and measure the number of times an ad is viewed. IF-6 provides the carriage of metadata to enable tracking scenarios that are described by IF-8. Ad tracking and measurement integrations can be very workflow and client platform dependent and are therefore out of the scope of this document. Known tracking mechanisms are provided as informational references.
1.2.10.2. VAST View Tracking
The IAB VAST specification describes a Tracking element which provides a URI that should be requested when a named event occurs within a creative. A subset of these events describes progression through a linear creative by tracking points of time within the creative, such as start, end, first quartile, second quartile, etc. As these events are directly timed with the playout of the media, the DASH event mechanisms can be used to convey and align these events with the linear creative.
When translating these named events to timed events, the presentation time of the event should correlate to the same logical position as the VAST named event. For instance, the start event should have a presentation time of 0 relative to the start of the Period containing the ad creative. If no client application processing is required of the event, the DASH Callback Event scheme may be used to have the DASH client directly perform requests, see section 1.2.8.2 for further details. Should a service provider wish to handle the request on their own, to further process the URI or provide it to a third-party library, a custom event scheme may be established by the provider and utilized by the client application.
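The translation from VAST named events to presentation times can be sketched as follows. The event names used are the VAST linear progress events (start, firstQuartile, midpoint, thirdQuartile, complete); the function name and the use of a timescale of 1000 in the example are assumptions.

```python
from fractions import Fraction

# Fraction of the creative duration at which each VAST progress event occurs.
VAST_PROGRESS_EVENTS = {
    "start": Fraction(0),
    "firstQuartile": Fraction(1, 4),
    "midpoint": Fraction(1, 2),
    "thirdQuartile": Fraction(3, 4),
    "complete": Fraction(1),
}


def event_presentation_times(duration_ticks):
    """Map each event name to a presentation time in timescale ticks.

    Times are relative to the start of the Period containing the creative and
    are truncated to whole ticks.
    """
    return {
        name: int(duration_ticks * fraction)
        for name, fraction in VAST_PROGRESS_EVENTS.items()
    }


# A 30 second creative with a timescale of 1000 ticks per second.
print(event_presentation_times(30_000))
```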
1.2.10.3. Open Measurement SDK
The IAB Tech Lab has produced the Open Measurement SDK [95] as a way of facilitating third-party viewability and verification measurements without requiring SDKs from individual measurement providers. Service providers integrate the Open Measurement SDK into their client applications and the SDK facilitates the execution of measurement provider defined tracking parameters for each ad creative.
For each ad creative played, the SDK must be instantiated with ad metadata describing the measurement providers and their parameters for the creative. This metadata may be carried via IF-6 using a custom event scheme and surfaced to the client application, which can then initialize the SDK for the creative.
1.2.10.4. Alternative Tracking Methods
Other tracking services, such as proprietary or open solutions, could be used to enable reliable tracking of viewer impressions.