language:
  - en
license: apache-2.0
tags:
  - sentence-transformers
  - sentence-similarity
  - feature-extraction
  - generated_from_trainer
  - dataset_size:3825
  - loss:TripletLoss
base_model: BAAI/bge-large-en-v1.5
widget:
  - source_sentence: >-
      Adobe Workfront is the leader in Collaborative Work Management and Project
      and Portfolio Management, and has been recognized by Gartner, Forrester,
      IDC, Constellation Research, and more for its current capabilities and
      future vision.  Relying on instant messages, post-it notes, and meetings
      to keep everyone in the loop create information silos. Adobe Workfront
      tracks threaded conversations, status updates, and feedback in one place
      so communication is kept in context of work, and there’s no confusion or
      wasted time.  Project team members can collaborate on documents,
      timesheets and time usage, attach notes, comments, calendars, and record
      meeting discussions in custom forms in projects, tasks, issues, etc.
      Integration with collaboration tools like Teams and Slack are also
      natively available.   In addition, Workfront Proof provides the ability
      for internal and external users to collaborate on documents, videos,
      websites and over 150 file types.
    sentences:
      - >-
        Is there a pre-built report that provides a visual representation of all
        the marketing program interactions that impacted the leads linked to a
        specific account and opportunity?
      - >-
        Is real-time collaboration offered? Does your platform include features
        such as real-time persistent chat, shared whiteboards, team meetings
        with audio, video, and screen sharing, as well as VoIP and telephony
        integration?
      - >-
        This is supported with the schema defined as part of the Experience Data
        Model (XDM). Changes to a schema, data set and data ingestion process
        can be carried out without breaking or invalidating past
        ingestions.https://experienceleague.adobe.com/docs/experience-platform/xdm/schema/composition.html#evolution
  - source_sentence: >-
      Real-Time CDP enables you to create robust, centralized profiles
      containing customer attributes and timestamped events each customer
      interaction across systems integrated with Adobe Real-Time CDP. The format
      and structure of this data is provided by Experience Data Model (XDM)
      schemas, with each schema being based upon an XDM class and containing
      fields that are compatible with that class.Schemas can be created for
      multiple use cases, referencing the same class but containing fields
      specific to their use. When a schema is enabled for Profile, it becomes
      part of a union schema. In other words, union schemas are composed of
      multiple schemas that share the same class and have been enabled for
      Profile. The union schema enables you to see an amalgamation of all of the
      fields contained within schemas sharing the same class. Real-Time CDP uses
      the union schema to create a holistic view of each individual customer.For
      more information on union schemas, visit
      https://experienceleague.adobe.com/docs/experience-platform/profile/union-schemas/union-schema.html?lang=en
    sentences:
      - >-
        In what ways does your platform's inventory facilitate the development
        of a data inventory for regulatory compliance?
      - >-
        The tool should have the capability to completely automate campaign
        outputs, encompassing everything from segment refresh to the generation
        of the output file.
      - >-
        You can manage campaigns across multiple channels in Adobe Marketo
        Engage, including email, social, paid media, SMS, in-app messaging and
        web. If there is a channel that Marketo Engage doesn't natively support
        (such as Direct Mail), you can leverage one of our pre-built
        integrations with partner technologies in our best-in-class partner
        ecosystem. This means that as your marketing strategy grows and you
        extend your presence across other channels, you won't need to worry
        about finding and integrating a point tool and making sure the data is
        getting passed correctly.Marketo Engage is designed to be a
        multi-channel marketing platform at its core. The platform offers
        numerous capabilities that make it easy to configure multi-channel
        campaigns and align communication to match customer preferences for
        channel. Due to the design and layout of the system, Marketo Engage
        consistently scores higher on ease-of-use and time-to-value by analysts
        and customers alike compared to other platforms.Because you're able to
        listen and engage across multiple channels using Marketo Engage, that
        means you're also able to report on which channels and campaigns are
        performing the best during different stages of the customer lifecycle.
  - source_sentence: "Adobe's Secure Product Lifecycle (SPLC) was designed from the ground up and integrated into multiple stages of the product lifecycle to help keeping customer information safe and secure. It is comprised of\_a\_rigorous set of several hundred specific security activities spanning software development practices, processes and tools. Adobe SPLC controls include, depending on the specific Adobe product or service, some or all of the following recommended practices, processes and tools:- Security training and certification for product teams- Product health, risk, and threat landscape analysis- Secure coding guidelines, rules and analysis- Service roadmaps, security tools, and testing methods that guide the security team to help address the Open Web Application Security Project (OWASP) Top 10 most critical web application security flaws and CWE/SANS Top 25 most dangerous software errors- Security architecture reviews and penetration testing- Source code reviews to help eliminate known flaws that could lead to vulnerabilities- User-generated content validation- Static and dynamic code analysis- Application and network scanning- Readiness reviews, response plans, and release of developer education materials."
    sentences:
      - >-
        Rephrased question: 


        Commitment to Good Coding Practices 


        a) Astro and third parties should commit to making reasonable efforts to
        adhere to good coding practices. 


        b) Compliance with recognized industry standards, such as those
        established by The Open Web Application Security Project (OWASP)
        Foundation, is recommended. 


        c) Astro and third parties should agree to adhere to a defined set of
        secure coding guidelines that specify the expected formatting,
        structure, and commenting of code. 


        d) Thorough commenting is required for all security-relevant code. 


        e) Guidance on how to avoid common security vulnerabilities must be
        included. 


        f) Prior to being considered ready for unit testing, all code must
        undergo review by at least one additional Developer to ensure it meets
        the security requirements and coding guidelines. 


        g) Developers must provide and follow a security test plan that outlines
        the approach for testing and confirming compliance with each security
        requirement. 


        h) Developers will execute the security test plan and, if necessary for
        audit purposes, present the test results to Astro. 


        i) Developers agree to deliver secure configuration guidelines that
        comprehensively explain all security-related configuration options and
        their implications for the overall security of the software.
      - >-
        The Service Provider will deliver maintenance services that encompass
        both manual and automated patch management, as well as the application
        of patches that have received approval from the Agency.
      - >-
        While offering a decoupled architecture, with ContextHub, composition
        middleware, and front-end management capabilities, Adobe Experience
        Manager comes as a fully integrated solution that customers can use
        end-to-end or via leveraging portions of the platform. For example, only
        as a ContextHub (headless CMS) or only as composition middleware.As part
        of Adobe Experience Cloud, Adobe Experience Manager offers on
        always-up-to-date solution that isn't limited for extensibility, to fit
        the solution into the enterprise solution landscape. With the move to a
        cloud-native platform architecture, Adobe has preserved the
        extensibility for which the Experience Manager platform is known and
        respected.Delivering globally scalable experiences to a distributed
        workforce requires an architecture that natively includes the
        cloud-based edge technologies. With our CMS platform, edge computing is
        an integral part of the architecture. Keep in mind, this is not the same
        as leveraging CDNs (everybody can do that). This means that Adobe
        Experience Manager has been re-architected to run time-sensitive
        workloads seamlessly on edge-based cloud platforms.Adobe Experience
        Manager got its name from providing approachable in-context editing
        capabilities. Adobe keeps investing to bring in-context editing to any
        surface, and was the first to market with decoupled, in-context editing
        in Single Page Applications with a lightweight SDK. With experimentation
        and personalization by default, Adobe provides new ways for brands and
        enterprises to have a minimal overhead to continuously optimize
        experiences.Deep integration with the Adobe Creative and Adobe Document
        Clouds allows for access to industry-leading content creation
        capabilities and asset / document services like e-signatures.
  - source_sentence: >-
      Functionality is provided as a native feature and is included in the base
      price of the commerce system Native support for customer-specific price
      books / catalogs, restrictions can be made by customer segment. Customer
      segments can also be assigned to specific websites, allowing for different
      segmentation strategies across multiple site. Segments can be applied to
      visitors, registered customers, or both, allowing for broad or specific
      targeting.
    sentences:
      - >-
        Adobe Experience Platform supports a number of identifiers which can be
        broadly classified in three categories-An identity such as a login ID,
        ECID, or loyalty ID is referred to as a known identity.PII such as email
        address and phone number, serves to directly identify a customer. As a
        result, PII is used to match a customer’s multiple identities across
        systems.Unknown or anonymous identities single out a device without
        identifying the actual person using it. This category includes
        information such as a visitor’s IP address and cookie ID. In addition,
        please see the above response DM-5 to understand how ID's are stitched
        together using the identity service.Reference
        Material:https://experienceleague.adobe.com/docs/experience-platform/identity/home.html?lang=en
      - >-
        In what ways does your company collect enhancement feedback from
        customers and engage them in determining the priority of future
        releases? Additionally, how does your company assess its effectiveness
        in meeting customer needs?
      - >-
        Capability to categorize the catalog and limit visibility of sections
        based on country, region, brand, business line, customer, segment, and
        role.
  - source_sentence: >-
      Adobe Experience Platform helps you create a Real-time Customer Profile
      for each customer record where you can see a holistic view of each
      individual customer by combining data from multiple sources, channels,
      including online, offline, CRM, and third party. Profile allows you to
      consolidate your customer data into a unified view offering an actionable,
      timestamped account of every customer interaction. Further, each data
      source or channel might work on different customer identity and will share
      multiple identities with the Platform. Identity Service helps you to gain
      a better view of your customer and their behavior by bridging identities
      across devices and systems, allowing you to deliver impactful, personal
      digital experiences in real time. The Platform creates an identity graph,
      a map of relationships between different identity namespaces, providing
      you with a visual representation of how your customer interacts with your
      brand across different channels. The data captured in the datsets is
      secure and cannot be accessed outside of the Real time Customer Profile
      and segmentation. << Customer name >> users which elligible to access the
      data as per access control, can only access the data.Reference material:
      Identity Service -
      https://experienceleague.adobe.com/docs/experience-platform/identity/namespaces.html?lang=enAccess
      Control -
      https://experienceleague.adobe.com/docs/experience-platform/access-control/home.html?lang=en
    sentences:
      - >-
        How is security handled in relation to a single customer view when we
        grant access to various business units? Is user data explicitly linked
        to the division that supplied the source data, or to the profile that
        has been identified as comprising data from that division?
      - Is it possible to prioritize tasks within the application?
      - >-
        Clients may capture new project or other work requests through any
        number of request queues that the client can configure. Adobe Workfront
        provides a help desk area of the application where request queues can be
        configured for the purpose of capturing, routing, and managing various
        requests. Client can configure request forms through the UI and forms
        can include both native and custom fields. Routing rules and approval
        processes can be designated for each specific request queue.  Project
        requests may also require a business case to be built for the requested
        project. Adobe Workfront allows clients to build business cases for
        projects and these business cases can be used to evaluate the merits of
        a project. Information captured in business cases can include (but is
        not limited to) project goals/objectives, planned costs (expenses and
        resource related), high-level resources estimates, alignment scorecard,
        potential risks, and any custom data fields the client chooses to add.
pipeline_tag: sentence-similarity
library_name: sentence-transformers

BGE large model

This is a sentence-transformers model finetuned from BAAI/bge-large-en-v1.5. It maps sentences & paragraphs to a 1024-dimensional dense vector space and can be used for semantic textual similarity, semantic search, paraphrase mining, text classification, clustering, and more.

Model Details

Model Description

  • Model Type: Sentence Transformer
  • Base model: BAAI/bge-large-en-v1.5
  • Maximum Sequence Length: 512 tokens
  • Output Dimensionality: 1024 dimensions
  • Similarity Function: Cosine Similarity
  • Language: en
  • License: apache-2.0

Full Model Architecture

SentenceTransformer(
  (0): Transformer({'max_seq_length': 512, 'do_lower_case': True}) with Transformer model: BertModel 
  (1): Pooling({'word_embedding_dimension': 1024, 'pooling_mode_cls_token': True, 'pooling_mode_mean_tokens': False, 'pooling_mode_max_tokens': False, 'pooling_mode_mean_sqrt_len_tokens': False, 'pooling_mode_weightedmean_tokens': False, 'pooling_mode_lasttoken': False, 'include_prompt': True})
  (2): Normalize()
)
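
The last two modules above can be read as "take the [CLS] token's vector, then scale it to unit length". The following is a minimal pure-Python sketch of those two steps on toy data; the helper names `cls_pool` and `l2_normalize` are hypothetical and are not part of the Sentence Transformers API, and the 4-dimensional vectors stand in for the model's 1024-dimensional ones.

```python
import math


def cls_pool(token_embeddings):
    """Use the first token's vector ([CLS]) as the sentence embedding."""
    return list(token_embeddings[0])


def l2_normalize(vector):
    """Scale a vector to unit Euclidean length (zero vectors are returned unchanged)."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0.0:
        return list(vector)
    return [x / norm for x in vector]


# Toy output of the Transformer module: 3 tokens, 4 dimensions each.
token_embeddings = [
    [3.0, 0.0, 4.0, 0.0],  # [CLS] token
    [1.0, 1.0, 1.0, 1.0],
    [0.5, 2.0, 0.0, 1.0],
]

sentence_embedding = l2_normalize(cls_pool(token_embeddings))
print(sentence_embedding)  # [0.6, 0.0, 0.8, 0.0]
```

Because every embedding leaves the model with unit length, the cosine similarity of two embeddings equals their dot product.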

Usage

Direct Usage (Sentence Transformers)

First install the Sentence Transformers library:

pip install -U sentence-transformers

Then you can load this model and run inference.

from sentence_transformers import SentenceTransformer

# Download from the 🤗 Hub
model = SentenceTransformer("bge-large-triplet-v1.5")
# Run inference
sentences = [
    'Adobe Experience Platform helps you create a Real-time Customer Profile for each customer record where you can see a holistic view of each individual customer by combining data from multiple sources, channels, including online, offline, CRM, and third party. Profile allows you to consolidate your customer data into a unified view offering an actionable, timestamped account of every customer interaction. Further, each data source or channel might work on different customer identity and will share multiple identities with the Platform. Identity Service helps you to gain a better view of your customer and their behavior by bridging identities across devices and systems, allowing you to deliver impactful, personal digital experiences in real time. The Platform creates an identity graph, a map of relationships between different identity namespaces, providing you with a visual representation of how your customer interacts with your brand across different channels. The data captured in the datsets is secure and cannot be accessed outside of the Real time Customer Profile and segmentation. << Customer name >> users which elligible to access the data as per access control, can only access the data.Reference material: Identity Service - https://experienceleague.adobe.com/docs/experience-platform/identity/namespaces.html?lang=enAccess Control - https://experienceleague.adobe.com/docs/experience-platform/access-control/home.html?lang=en',
    'How is security handled in relation to a single customer view when we grant access to various business units? Is user data explicitly linked to the division that supplied the source data, or to the profile that has been identified as comprising data from that division?',
    'Clients may capture new project or other work requests through any number of request queues that the client can configure. Adobe Workfront provides a help desk area of the application where request queues can be configured for the purpose of capturing, routing, and managing various requests. Client can configure request forms through the UI and forms can include both native and custom fields. Routing rules and approval processes can be designated for each specific request queue.  Project requests may also require a business case to be built for the requested project. Adobe Workfront allows clients to build business cases for projects and these business cases can be used to evaluate the merits of a project. Information captured in business cases can include (but is not limited to) project goals/objectives, planned costs (expenses and resource related), high-level resources estimates, alignment scorecard, potential risks, and any custom data fields the client chooses to add.',
]
embeddings = model.encode(sentences)
print(embeddings.shape)
# (3, 1024)

# Get the similarity scores for the embeddings
similarities = model.similarity(embeddings, embeddings)
print(similarities.shape)
# torch.Size([3, 3])
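
For semantic search, the similarity scores are typically used to rank candidate passages against a query. The sketch below shows that ranking step in pure Python on toy 2-dimensional vectors; in practice the vectors would come from `model.encode(...)`. The helper names `cosine_similarity` and `rank_candidates` are hypothetical and only illustrate the idea.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def rank_candidates(query_vec, candidate_vecs):
    """Return (index, score) pairs sorted from most to least similar."""
    scored = [(i, cosine_similarity(query_vec, c)) for i, c in enumerate(candidate_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Toy stand-ins for a question embedding and three answer embeddings.
query = [1.0, 0.0]
candidates = [[0.0, 1.0], [1.0, 1.0], [2.0, 0.0]]

ranking = rank_candidates(query, candidates)
print([index for index, _ in ranking])  # [2, 1, 0]
```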

Training Details

Training Dataset

Unnamed Dataset

  • Size: 3,825 training samples
  • Columns: positive, anchor, and negative
  • Approximate statistics based on the first 1000 samples:
    |         | positive | anchor | negative |
    |---------|----------|--------|----------|
    | type    | string   | string | string   |
    | details | min: 7 tokens, mean: 146.93 tokens, max: 512 tokens | min: 4 tokens, mean: 23.8 tokens, max: 128 tokens | min: 3 tokens, mean: 141.82 tokens, max: 512 tokens |
  • Samples:
    Sample 1
      • positive: Adobe Commerce being an Open-source platform nurtures community of users to contribute learn and connect to our platform. There are over 400K+ developers and community members worldwide with Adobe Commerce development experience, and over 8,000 Certified Adobe Commerce Developers, who can support projects and implementations. This global community is truly dedicated to the growth of our platform and success of our customers. In addition, you can easily grow and scale your team because Adobe Commerce talent is easy to find. For more details, please see below: - https://business.adobe.com/in/products/magento/community.html#
      • anchor: Can you provide an overview of the Adobe Commerce Developer Community?
      • negative: Streaming ingestion for Adobe Experience Platform provides users a method to send data from client and server-side devices to Experience Platform in real time. Streaming ingestion plays a key role in building real-time customer profiles by enabling <> to deliver Profile data into the Data Lake with as little latency as possible. The stream connector for Adobe Experience Platform is based on Apache Kafka Connect. This library can be used to stream JSON events from Kafka topics in <> data centre directly to Experience Platform in real time. The stream connector is a sink (one-way) connector, delivering data from Kafka topics to a registered endpoint on Experience Platform. The connector supports the following features:1. Authenticated collection of data2. Batching messages to reduce network calls and increase throughputFull documentation here: https://experienceleague.adobe.com/docs/experience-platform/ingestion/streaming/kafka.html?lang=en
    Sample 2
      • positive: Adobe Commerce has extensive experience in the B2C environment. Our platform supports B2C business models out of the box and provides a range of features and capabilities to enhance the B2C customer experience. With Adobe Commerce, businesses can create personalized commerce journeys, boost conversion and sales with AI-powered merchandising tools, and provide a seamless and intuitive shopping experience for their customers.
      • anchor: Can you explain your experience working in the B2C sector?
      • negative: Adobe’s vision is to empower companies to unify end-to-end customer experiences from creation to commerce, driving loyalty and business growth. Our company values — Create the future, Own the outcome, Raise the bar, and Be genuine — represent who we are, how we show up in the world, and how we’ll define our future success.
    Sample 3
      • positive: Adobe Professional Services takes a phased approach in implementation. In the first phase which we call a “Design and plan” phase we define business requirements, features, and KPIs. We run a series of workshops in first 4-5 days to gather requirements and then design a best-in-class architecture considering your goals and capabilities, Integrations and customizations.  Key out comes of Design and plan phase : Defined Success Criteria and KPIs  Business Requirements  Data migration strategy Feature Matrix Technical Architecture - scale to future needs Catalog setup, customizations, and integrations. Detailed Roadmap We believe architecting the overall solution and key system integrations aligned to your business long term strategy is crucial to ensuring a successful commerce platform implementation.
      • anchor: Please provide an overview of the workshop focusing on functionality, design, and architecture.
      • negative: Streaming segmentation on Adobe Experience Platform allows customers to do segmentation in near real-time while focusing on data richness. With streaming segmentation, segment qualification now happens as streaming data lands into Platform, alleviating the need to schedule and run segmentation jobs. This essentially ensures that the right customers are targeted in near real-time and they are added/removed from a digital marketing activity across various channels including Advertising ecosystems such as DSP, Social, Search etc
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
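
For a single triplet, these parameters correspond to the standard hinge form of the triplet loss, max(d(anchor, positive) - d(anchor, negative) + margin, 0), with d the Euclidean distance and margin = 5. The sketch below computes that value in pure Python on toy 2-dimensional unit vectors; the function names are hypothetical, and the real loss operates on batches of 1024-dimensional embeddings.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_loss(anchor, positive, negative, margin=5.0):
    """Hinge loss that is zero once the negative is `margin` farther away than the positive."""
    return max(euclidean(anchor, positive) - euclidean(anchor, negative) + margin, 0.0)


# Toy unit-length embeddings (illustrative values, not model output).
anchor = [1.0, 0.0]
positive = [0.6, 0.8]
negative = [-1.0, 0.0]

loss = triplet_loss(anchor, positive, negative)            # margin from the card
loss_small_margin = triplet_loss(anchor, positive, negative, margin=1.0)
print(loss, loss_small_margin)
```

One consequence worth keeping in mind, assuming the loss is computed on the output of the Normalize() module: distances between unit vectors lie in [0, 2], so with a margin of 5 the hinge cannot reach zero, and every triplet keeps contributing a gradient throughout training.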
    

Evaluation Dataset

Unnamed Dataset

  • Size: 956 evaluation samples
  • Columns: positive, anchor, and negative
  • Approximate statistics based on the first 956 samples:
    |         | positive | anchor | negative |
    |---------|----------|--------|----------|
    | type    | string   | string | string   |
    | details | min: 8 tokens, mean: 139.34 tokens, max: 512 tokens | min: 4 tokens, mean: 25.97 tokens, max: 234 tokens | min: 8 tokens, mean: 137.66 tokens, max: 512 tokens |
  • Samples:
    Sample 1
      • positive: <> can import, edit, manage, as well as manually create profiles in Adobe Campaign.Using your data, your marketers can also use the powerful, user-friendly segmentation and targeting features to create highly targeted, differentiated segments through the easy-to-use, point and click interface. Segmentation can be based on an unlimited number of conditions utilizing the underlying marketing data, including historical customer transactions, demographics and marketing history.Once you have created your segments, the criteria logic used to create the lists can be saved as a Pre-Defined Filter. These filters are then available to reuse and select from a library of filters, eliminating the need to recreate the logic each time. You can then modify these pre-set filters and these filters will be applied dynamically during execution.
      • anchor: The tool should have the capability to generate data profiles internally.
      • negative: There is no limit to the number of concurrent users (with different users types) that Adobe solutions can support. We also provide scalable environment leveraging our flexible architecture.
    Sample 2
      • positive: Adobe does have anti-malware and anti-virus solutions installed on all workstations, as well as all Windows-based production servers. Adobe does not install anti-malware/anti-virus on Linux-based servers. Adobe has advanced security tools for Linux. Included in this toolset is file hash checking, centralized process monitoring, critical file monitoring, forced host hardening, and OS Query for real-time security investigations.
      • anchor: The solution should include support for malware scanning.
      • negative: Yes, Adobe Customer Journey Analytics has Retention rates view and cohort tables that show the percentage of users that return after their initial engagement within the desired date range. Presently, calculated metrics and participation attribution settings can be used to calculate the time between events for particular users. Please see: https://experienceleague.adobe.com/docs/analytics-platform/using/guided-analysis/retention/retention-rates.html?lang=enPlease see here for information on Cohort Analysis: https://experienceleague.adobe.com/docs/analytics-platform/using/cja-workspace/visualizations/cohort-table/cohort-analysis.html?lang=en
    Sample 3
      • positive: The user interface is customizable at the user level and allows the authorized admin users to customize it to meet business requirements. The platform provides a central web console configuration manager that allows administrators to configure the solution seamlessly. OSGi is a fundamental element in the technology stack of Adobe Experience Manager. It is used to control the composite bundles of AEM and their configuration. More details: https://experienceleague.adobe.com/docs/experience-manager-cloud-service/content/implementing/deploying/configuring-osgi.html?lang=en
      • anchor: Can the user interface be customized for individual users or groups? If so, what aspects can be customized?
      • negative: Yes, in data centers, DDoS mitigation contracts are in place with telecommunications providers to leverage DDoS "scrubbers" should they be necessary.  In Public cloud provider locations, we leverage provided methodologies including auto expansion of capacity and DDoS mitigation where possible. Synthetic monitoring solutions including NewRelic, run synthetic transactions against our infrastructure to monitor application performance. When latency is detected, our 24x7x65 operations center is alerted and escalates with operational teams as necessary. For further information, please see the Infrastructure & Virtualization Security section in 3. CSA CAIQ v3.1 Adobe Experience Platform 2020 within the accompanying security pack
  • Loss: TripletLoss with these parameters:
    {
        "distance_metric": "TripletDistanceMetric.EUCLIDEAN",
        "triplet_margin": 5
    }
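
The card does not state which metric was tracked on the evaluation triplets. One common choice for triplet data is triplet accuracy: the fraction of triplets in which the anchor is closer to the positive than to the negative. The sketch below is a pure-Python illustration on toy vectors, using the same Euclidean distance as the loss; the function names are hypothetical and the numbers are not results from this model.

```python
import math


def euclidean(u, v):
    """Euclidean distance between two vectors of equal length."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_accuracy(triplets):
    """Share of (anchor, positive, negative) triplets ranked correctly."""
    correct = sum(
        1
        for anchor, positive, negative in triplets
        if euclidean(anchor, positive) < euclidean(anchor, negative)
    )
    return correct / len(triplets)


# Toy embeddings: the first two triplets are ranked correctly, the third is not.
toy_triplets = [
    ([1.0, 0.0], [0.9, 0.1], [0.0, 1.0]),
    ([0.0, 1.0], [0.1, 0.9], [1.0, 0.0]),
    ([1.0, 0.0], [0.0, 1.0], [0.9, 0.1]),
]

accuracy = triplet_accuracy(toy_triplets)
print(accuracy)
```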
    

Framework Versions

  • Python: 3.11.11
  • Sentence Transformers: 3.4.1
  • Transformers: 4.48.3
  • PyTorch: 2.5.1+cu124
  • Accelerate: 1.3.0
  • Datasets: 3.3.2
  • Tokenizers: 0.21.0

Citation

BibTeX

Sentence Transformers

@inproceedings{reimers-2019-sentence-bert,
    title = "Sentence-BERT: Sentence Embeddings using Siamese BERT-Networks",
    author = "Reimers, Nils and Gurevych, Iryna",
    booktitle = "Proceedings of the 2019 Conference on Empirical Methods in Natural Language Processing",
    month = "11",
    year = "2019",
    publisher = "Association for Computational Linguistics",
    url = "https://arxiv.org/abs/1908.10084",
}

TripletLoss

@misc{hermans2017defense,
    title={In Defense of the Triplet Loss for Person Re-Identification},
    author={Alexander Hermans and Lucas Beyer and Bastian Leibe},
    year={2017},
    eprint={1703.07737},
    archivePrefix={arXiv},
    primaryClass={cs.CV}
}