88 datasets found
  1. ubuntu_dialogs_corpus

    • huggingface.co
    Updated Mar 25, 2023
    Cite
    The HF Datasets community (2023). ubuntu_dialogs_corpus [Dataset]. https://huggingface.co/datasets/ubuntu_dialogs_corpus
    Dataset updated
    Mar 25, 2023
    Dataset authored and provided by
    The HF Datasets community
    License

    https://choosealicense.com/licenses/unknown/

    Description

    Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words. This provides a unique resource for research into building dialogue managers based on neural language models that can make use of large amounts of unlabeled data. The dataset has both the multi-turn property of conversations in the Dialog State Tracking Challenge datasets, and the unstructured nature of interactions from microblog services such as Twitter.

  2. ubuntu_dialogue_qa

    • huggingface.co
    Updated Feb 27, 2023
    Cite
    Richard Nagyfi (2023). ubuntu_dialogue_qa [Dataset]. https://huggingface.co/datasets/sedthh/ubuntu_dialogue_qa
    Dataset updated
    Feb 27, 2023
    Authors
    Richard Nagyfi
    License

    MIT License: https://opensource.org/licenses/MIT
    License information was derived automatically

    Description

    Dataset Card for "ubuntu_dialogue_qa"

    Filtered the Ubuntu dialogue chatlogs from https://www.kaggle.com/datasets/rtatman/ubuntu-dialogue-corpus to include Q&A pairs ONLY.

    Acknowledgements

    This dataset was ORIGINALLY collected by Ryan Lowe, Nissan Pow, Iulian V. Serban and Joelle Pineau. It is made available here under the Apache License, 2.0. If you use this data in your work, please include the following citation: Ryan Lowe, Nissan Pow, Iulian V. Serban and Joelle Pineau, "The… See the full description on the dataset page: https://huggingface.co/datasets/sedthh/ubuntu_dialogue_qa.

  3. Ubuntu Chat Corpus Dataset

    • paperswithcode.com
    Updated Apr 23, 2013
    Cite
    (2013). Ubuntu Chat Corpus Dataset [Dataset]. https://paperswithcode.com/dataset/ubuntu-chat-corpus
    Dataset updated
    Apr 23, 2013
    Description

    The Ubuntu Chat Corpus (UCC) is composed of archived chat logs from Ubuntu's Internet Relay Chat technical support channels. Ubuntu uses IRC as one of many modes of technical support -- it offers real-time problem solving. The authors have taken some of the archived messages (which are in the public domain), reorganized the file structure, removed some unnecessary system messages, and compressed them to make it easier to obtain.

  4. Ubuntu-Dialogue-Corpus

    • kaggle.com
    Updated Aug 17, 2017
    Cite
    (2017). Ubuntu-Dialogue-Corpus [Dataset]. https://www.kaggle.com/rtatman/ubuntu-dialogue-corpus/discussion
    Croissant is a format for machine-learning datasets. Learn more about this at mlcommons.org/croissant.
    Dataset updated
    Aug 17, 2017
    Description

    26 million turns from natural two-person dialogues

  5. Ubuntu Dialogue Corpus

    • live.european-language-grid.eu
    csv
    Updated Dec 30, 2015
    Cite
    (2015). Ubuntu Dialogue Corpus [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/5115
    Available download formats: csv
    Dataset updated
    Dec 30, 2015
    License

    Apache License, v2.0: https://www.apache.org/licenses/LICENSE-2.0
    License information was derived automatically

    Description

    Dialogues extracted from Ubuntu chat stream on IRC.

  6. Ubuntu Project

    • globaldata.com
    Updated Nov 13, 2023
    Cite
    GlobalData UK Ltd. (2023). Ubuntu Project [Dataset]. https://www.globaldata.com/store/report/ubuntu-project-profile-snapshot/
    Dataset updated
    Nov 13, 2023
    Dataset provided by
    GlobalData Ltd
    GlobalData: https://www.globaldata.com/
    Authors
    GlobalData UK Ltd.
    License

    https://www.globaldata.com/privacy-policy/

    Time period covered
    2023 - 2027
    Description

    The Ubuntu Project is a coal mine in South Africa. It is currently in operation. Empower your strategies with our Ubuntu Project report and make more profitable business decisions. Note: This is an on-demand report that will be delivered upon request. The report will be deliv…

  7. ubuntu

    • kaggle.com
    Updated Aug 28, 2022
    Cite
    hope531 (2022). ubuntu [Dataset]. http://doi.org/10.34740/kaggle/ds/2440949
    Dataset updated
    Aug 28, 2022
    Dataset provided by
    Kaggle: http://kaggle.com/
    Authors
    hope531
    License

    http://www.gnu.org/licenses/lgpl-3.0.html

    Description

    FSE 2022 artifact for the paper "Large-Scale Analysis of Non-Termination Bugs in Real-World OSS Projects".

    Install Our Artifact

    We provide an Ubuntu OVA (Open Virtualization Appliance), which contains the information needed to deploy the virtual machine in VirtualBox.

    -First, download and install VirtualBox from the following URL: https://www.virtualbox.org/wiki/Downloads

    -Open VirtualBox and import our OVA file. Set 15 GB of RAM and 8 processing units to match the configuration used in SV-COMP.

    Invoke State-of-the-art Termination Analysis Tools

    Start the virtual machine and open a terminal. Username: ubuntu, password: ubuntu

    To get the evaluation results of all five state-of-the-art termination analysis tools, execute the following instructions:

    -cd /home/ubuntu/tool/result
    -./test.sh

    NOTICE: There are three resource limits for each verification run: a memory limit of 15 GB (14.6 GiB) of RAM, a runtime limit of 15 min of CPU time. Therefore,

    To get the evaluation results of a specific state-of-the-art termination analysis tool (e.g., Aprove), execute the following instructions:

    -cd /home/ubuntu/tool/result
    -cd Aprove (get the result of Aprove)
    -cd loop (get loop result)
    -./test.sh

    All results are saved in /home/ubuntu/tool/result.

  8. threads-ask-ubuntu

    • zenodo.org
    json
    Updated Dec 16, 2023
    Cite
    Nicholas Landry (2023). threads-ask-ubuntu [Dataset]. http://doi.org/10.5281/zenodo.10373311
    Available download formats: json
    Dataset updated
    Dec 16, 2023
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Nicholas Landry
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Overview

    This is a temporal higher-order network dataset, which here means a sequence of timestamped hyperedges where each hyperedge is a set of nodes. In this dataset, nodes are users on askubuntu.com, and a hyperedge comes from users participating in a thread that lasts for at most 24 hours. The timestamps are the time of the post, but normalized so that the earliest post starts at 0.
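
    As a rough illustration of the representation described above, the sketch below stores each hyperedge as a (timestamp, set of users) pair and shifts the timestamps so that the earliest post starts at 0. The user names, the raw timestamps, and the `normalize_timestamps` helper are hypothetical; the actual file layout of the dataset is not described here, so this is only a sketch under those assumptions.

```python
from typing import FrozenSet, List, Tuple

# A timestamped hyperedge: (timestamp, set of participating user ids).
Hyperedge = Tuple[float, FrozenSet[str]]


def normalize_timestamps(edges: List[Hyperedge]) -> List[Hyperedge]:
    """Shift all timestamps so the earliest is 0 and sort by time."""
    if not edges:
        return []
    t0 = min(t for t, _ in edges)
    shifted = [(t - t0, nodes) for t, nodes in edges]
    # Sort on the timestamp only; frozensets have no total order.
    return sorted(shifted, key=lambda edge: edge[0])


# Hypothetical toy data: three threads with made-up users and times.
raw = [
    (1_600_000_500.0, frozenset({"alice", "bob"})),
    (1_600_000_000.0, frozenset({"bob", "carol", "dave"})),
    (1_600_003_600.0, frozenset({"alice"})),
]

normalized = normalize_timestamps(raw)
for t, nodes in normalized:
    print(t, sorted(nodes))
```

    Using a frozenset for each hyperedge keeps the "set of nodes" semantics explicit: participant order does not matter and a user appears at most once per thread.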

    Source of original data

    Source: threads-ask-ubuntu dataset

    References

    If you use this data, please cite the following paper:

  9. CKAN Installation Tutorial on Ubuntu 20.04 - Dataset - STC Training Center

    • training.stcenter.net
    Updated Jan 28, 2022
    Cite
    (2022). CKAN Installation Tutorial on Ubuntu 20.04 - Dataset - STC Training Center [Dataset]. https://training.stcenter.net/dataset/ckan-installation-tutorial-on-ubuntu-20-04
    Dataset updated
    Jan 28, 2022
    Description

    This video tutorial explains the process of installing CKAN on Ubuntu 20.04. This is general information for installing an operating system that can be used in many other applications.

  10. UDC (Ubuntu Dialogue Corpus)

    • opendatalab.com
    zip
    Updated Mar 17, 2023
    Cite
    McGill University (2023). UDC (Ubuntu Dialogue Corpus) [Dataset]. https://opendatalab.com/OpenDataLab/UDC
    Available download formats: zip
    Dataset updated
    Mar 17, 2023
    Dataset provided by
    University of Montreal
    McGill University
    Description

    This paper introduces the Ubuntu Dialogue Corpus, a dataset containing almost 1 million multi-turn dialogues, with a total of over 7 million utterances and 100 million words.

  11. Ubuntu

    • live.european-language-grid.eu
    tmx
    Updated Mar 26, 2022
    Cite
    (2022). Ubuntu [Dataset]. https://live.european-language-grid.eu/catalogue/corpus/7339
    Available download formats: tmx
    Dataset updated
    Mar 26, 2022
    License

    Attribution-NonCommercial 4.0 (CC BY-NC 4.0): https://creativecommons.org/licenses/by-nc/4.0/
    License information was derived automatically

    Description

    EN-IS parallel corpus of Ubuntu localization files, 10,572 TUs, EN-IS, Domain: Software interface. The data originally came in aligned format from the Arni Magnusson Institute in Iceland. The following processing was performed: manual spot-check for quality.

  12. tags-ask-ubuntu

    • zenodo.org
    json
    Updated Nov 19, 2023
    Cite
    Nicholas Landry (2023). tags-ask-ubuntu [Dataset]. http://doi.org/10.5281/zenodo.10155835
    Available download formats: json
    Dataset updated
    Nov 19, 2023
    Dataset provided by
    Zenodo: http://zenodo.org/
    Authors
    Nicholas Landry
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Overview

    This is a temporal hypergraph dataset, which here means a sequence of timestamped hyperedges where each hyperedge is a set of nodes. In this dataset, nodes are tags, and hyperedges are the sets of tags applied to questions on askubuntu.com. The timestamps are in ISO8601 format and are normalized to start at 0. This dataset is derived from tags on Ask Ubuntu posts.

    Statistics

    Some basic statistics of this dataset are:

    • number of nodes: 3,029
    • number of timestamped hyperedges: 271,233
    • distribution of the connected components (component size, number):
        • 3021, 1
        • 1, 8
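
    The component counts listed above are consistent with the node total (3021 × 1 + 1 × 8 = 3,029). A minimal sketch of how such a distribution could be computed from a list of hyperedges follows, using a union-find structure. The tag names and the `component_size_distribution` helper are hypothetical and are not taken from the dataset's own tooling.

```python
from collections import Counter
from typing import Dict, Hashable, Iterable


def component_size_distribution(
    nodes: Iterable[Hashable],
    hyperedges: Iterable[Iterable[Hashable]],
) -> Counter:
    """Return a Counter mapping component size -> number of components."""
    parent: Dict[Hashable, Hashable] = {n: n for n in nodes}

    def find(x: Hashable) -> Hashable:
        # Follow parent pointers to the root, halving the path as we go.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for edge in hyperedges:
        members = list(edge)
        for m in members:
            parent.setdefault(m, m)  # tolerate nodes seen only in edges
        # All nodes in one hyperedge belong to the same component.
        for m in members[1:]:
            root_a, root_b = find(members[0]), find(m)
            if root_a != root_b:
                parent[root_b] = root_a

    sizes = Counter(find(n) for n in parent)  # root -> component size
    return Counter(sizes.values())            # size -> how many components


# Hypothetical toy data: five tags, two questions.
tags = ["apt", "boot", "grub", "wifi", "printing"]
questions = [{"apt", "boot"}, {"boot", "grub"}]

dist = component_size_distribution(tags, questions)
for size, count in sorted(dist.items(), reverse=True):
    print(size, count)
```

    In this toy example, three tags are linked through shared questions and two tags never co-occur with anything, which mirrors the shape of the real statistics: one large component plus a few isolated nodes.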

    Source of original data

    Sources:

    References

    If you use this dataset, please cite these references:

  13. CKAN Installation Guide on Ubuntu 20 - Dataset - STC Training Center

    • training.stcenter.net
    Updated Jan 30, 2022
    Cite
    (2022). CKAN Installation Guide on Ubuntu 20 - Dataset - STC Training Center [Dataset]. https://training.stcenter.net/dataset/ckan-installation-guide-on-ubuntu-20
    Dataset updated
    Jan 30, 2022
    Description

    This guide covers the CKAN installation process, with Ubuntu as a dependency. Ubuntu is a Linux-based open-source operating system. This is a general guide created by Jacob Cain.

  14. my-test-dataset-ubuntu

    • huggingface.co
    Updated Apr 21, 2023
    Cite
    excode (2023). my-test-dataset-ubuntu [Dataset]. https://huggingface.co/datasets/excode/my-test-dataset-ubuntu
    Dataset updated
    Apr 21, 2023
    Authors
    excode
    Description

    excode/my-test-dataset-ubuntu dataset hosted on Hugging Face and contributed by the HF Datasets community

  15. Virtual Ubuntu Environment via VirtualBox - Dataset - STC Training Center

    • training.stcenter.net
    Updated Feb 24, 2022
    Cite
    (2022). Virtual Ubuntu Environment via VirtualBox - Dataset - STC Training Center [Dataset]. https://training.stcenter.net/dataset/virtual-ubuntu-environment-via-virtualbox
    Dataset updated
    Feb 24, 2022
    Description

    This is a guide explaining how to install VirtualBox. VirtualBox is a tool which allows you to run different operating systems virtually on your host operating system. This is referred to as a virtual machine.

  16. irc_disentanglement

    • tensorflow.org
    Updated Dec 10, 2022
    Cite
    (2022). irc_disentanglement [Dataset]. https://www.tensorflow.org/datasets/catalog/irc_disentanglement
    Dataset updated
    Dec 10, 2022
    Description

    The IRC Disentanglement dataset contains over 77,563 messages from the Ubuntu IRC channel.

    Features include the message id, message text and timestamp. The target is the list of messages that the current message replies to. Each record contains a list of messages from one day of IRC chat.

    To use this dataset:

    import tensorflow_datasets as tfds

    # Load the training split and print the first four records.
    ds = tfds.load('irc_disentanglement', split='train')
    for ex in ds.take(4):
        print(ex)
    

    See the guide for more information on tensorflow_datasets.

  17. Data from: Mastering Ubuntu server

    • workwithdata.com
    Updated Jun 1, 2023
    Cite
    Work With Data (2023). Mastering Ubuntu server [Dataset]. https://www.workwithdata.com/object/mastering-ubuntu-server-book-by-jay-lacroix-0000
    Dataset updated
    Jun 1, 2023
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Explore Mastering Ubuntu server through unique data from multiple sources: key facts, real-time news, interactive charts, detailed maps & open datasets

  18. Linux bible : boot up to Ubuntu, Fedora, KNOPPIX, Debian, openSUSE, and 13...

    • workwithdata.com
    Updated May 13, 2023
    Cite
    Work With Data (2023). Linux bible : boot up to Ubuntu, Fedora, KNOPPIX, Debian, openSUSE, and 13 .. [Dataset]. https://www.workwithdata.com/object/linux-bible-boot-up-to-ubuntu-fedora-knoppix-debian-opensuse-13-other-distributors-book-by-chris-negus-1957
    Dataset updated
    May 13, 2023
    Dataset authored and provided by
    Work With Data
    License

    Attribution 4.0 (CC BY 4.0): https://creativecommons.org/licenses/by/4.0/
    License information was derived automatically

    Description

    Explore Linux bible : boot up to Ubuntu, Fedora, KNOPPIX, Debian, openSUSE, and 13 .. through unique data from multiple sources: key facts, real-time news, interactive charts, detailed maps & open datasets

  19. Ubuntu – Evolve Condominium Towers – Colorado

    • globaldata.com
    Updated May 10, 2023
    Cite
    GlobalData UK Ltd. (2023). Ubuntu – Evolve Condominium Towers – Colorado [Dataset]. https://www.globaldata.com/store/report/ubuntu-evolve-condominium-towers-colorado-profile-snapshot/
    Dataset updated
    May 10, 2023
    Dataset provided by
    GlobalData: https://www.globaldata.com/
    Authors
    GlobalData UK Ltd.
    License

    https://www.globaldata.com/privacy-policy/

    Time period covered
    2023 - 2027
    Area covered
    North America
    Description

    Equip yourself with the essential tools needed to make informed and profitable decisions with our Ubuntu – Evolve Condominium Towers – Colorado report. Note: This is an on-demand report that will be delivered upon request. The report will be delivered within 2 to 3 busin…

  20. Linux Hint BD Events Database

    • linuxhintbd.blogspot.com
    csv, xml
    Updated Nov 3, 2022
    Cite
    LLC/Linux/Hint/BD > National Centers for Environmental Information, Linux Hint BD, NOAA, B.D. Department of IT (2022). Linux Hint BD Events Database [Dataset]. https://linuxhintbd.blogspot.com/
    Available download formats: xml, csv
    Dataset updated
    Nov 3, 2022
    Dataset authored and provided by
    LLC/Linux/Hint/BD > National Centers for Environmental Information, Linux Hint BD, NOAA, B.D. Department of IT
    License

    https://www.linuxhintbd.xyz

    Time period covered
    Jan 1, 1950 - Dec 18, 2013
    Area covered
    Pacific Ocean, North Pacific Ocean
    Dataset funded by
    Linux Server Support & IT Support Service
    Description

    Linux Hint BD Storm Data is provided by the National Weather Service (NWS) and contain statistics on...
