AI/ML

Article

sig-big-data: Apache Spark and Apache Airflow on Kubernetes

David Millsaps

This presentation will cover two projects from sig-big-data: Apache Spark on Kubernetes and Apache Airflow on Kubernetes. We will give an overview of the current state of both projects, present their roadmaps, and give attendees the opportunity to ask questions and provide feedback on those roadmaps.
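
To make the pairing concrete, here is a minimal sketch of an Airflow DAG that shells out to spark-submit against a Kubernetes cluster. The API server URL, container image, example jar path, and schedule are placeholders, and the exact import paths and spark-submit flags depend on the Airflow and Spark versions you deploy.

```python
# Hypothetical sketch: an Airflow DAG that submits the Spark Pi example to a
# Kubernetes cluster via spark-submit. API server URL, image, and jar path
# are placeholders; adjust for your Airflow/Spark versions.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    "owner": "sig-big-data",
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

dag = DAG(
    dag_id="spark_pi_on_kubernetes",
    default_args=default_args,
    start_date=datetime(2018, 1, 1),
    schedule_interval="@daily",  # run once a day, cron-style
)

submit_spark_pi = BashOperator(
    task_id="submit_spark_pi",
    bash_command=(
        "spark-submit "
        "--master k8s://https://kubernetes.default.svc:443 "  # placeholder API server
        "--deploy-mode cluster "
        "--name spark-pi "
        "--class org.apache.spark.examples.SparkPi "
        "--conf spark.kubernetes.container.image=example/spark:latest "  # placeholder image
        "local:///opt/spark/examples/jars/spark-examples.jar 1000"
    ),
    dag=dag,
)
```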

Article

Why Data Scientists Love Kubernetes

David Millsaps

Why Data Scientists Love Kubernetes - Sophie Watson & William Benton, Red Hat (https://www.youtube.com/watch?v=frQeK8xo9Ls). This talk will introduce the workflows and concerns of data scientists and machine learning engineers and demonstrate how to make Kubernetes a powerhouse for intelligent applications. We’ll show how community projects like Kubeflow and radanalytics.io support the entire intelligent application development lifecycle. We’ll cover several key benefits of Kubernetes for a data scientist’s workflow, from experiment design to publishing results. You’ll see how well scale-out...

Article

JBoss Data Virtualization: Integrating with Impala on Cloudera

Mike Echevarria

Integrate Cloudera's Apache Impala implementation as a data source in Red Hat's JBoss Data Virtualization. The goal of this post is to import data from a Cloudera Impala instance, manipulate it, and expose that data as a data service.
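
Before wiring Impala into JDV, it can help to confirm that the Impala daemon is reachable and the data looks right with a quick ad hoc query. Below is a minimal sketch using the impyla Python client; the host, port, and table name are placeholders, and this check is separate from the JDV data source configuration the post describes.

```python
# Hypothetical connectivity check against a Cloudera Impala instance using
# the impyla client; host, port, and table name are placeholders.
from impala.dbapi import connect

conn = connect(host="impala.example.com", port=21050)  # default impalad HiveServer2 port
cur = conn.cursor()

cur.execute("SHOW TABLES")
print(cur.fetchall())

cur.execute("SELECT * FROM my_table LIMIT 5")  # placeholder table
for row in cur.fetchall():
    print(row)

cur.close()
conn.close()
```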

Article

Unlock Your Cloudera Data with Red Hat JBoss Data Virtualization

Madou Coulibaly

After the "Unlock your Hadoop data with Hortonworks and Red Hat JBoss Data Virtualization" episode, let's continue the journey with another Apache Hadoop episode of the series "Unlock your [….] data with Red Hat JBoss Data Virtualization." Through this blog series, we will look at how to connect Red Hat JBoss Data Virtualization (JDV) to different and heterogeneous data sources. JDV is a lean, virtual data integration solution that unlocks trapped data and delivers it as easily consumable, unified, and actionable...

Article

Achieving Deployment Excellence with Red Hat OpenShift.io

Rob Terzi

Recently, the focus on the continuous delivery of value has created a lot of interest in microservices, CI/CD, and containers. The idea is that microservices are small and well-defined enough to enable rapid innovation, automated testing, and frequent deployments with minimal risk. This is made possible by adopting continuous integration and continuous delivery pipelines. CI/CD requires the ability to quickly, easily, reliably, and automatically create and tear down complete execution environments. Linux containers address this need by creating lightweight...

Article

OpenShift.io Developer Tools Overview - Summit 2017 - The Power of Cloud Workspaces - Part 2

Brian Atkisson

Part II of the OpenShift.io Developer Tools overview follows on the heels of the introduction session, this time presented by Pete Muir and Gorkem Ercan. In this session, we are taken through the integrated OpenShift.io Eclipse Che IDE. What is a Cloud Workspace? One of the fundamental problems with today's development methodology is that development happens on your laptop-- in a completely different environment from production. This is one of the major sources of bugs as your software is...

Article

OpenShift.io The Gathering - Summit 2017 - Developer Tools, Overview and Roadmap Part I

Brian Atkisson

Yesterday, at Red Hat Summit, Red Hat announced OpenShift.io. OpenShift.io is the next-generation OpenShift platform, based on OpenShift 3, for building and running applications in the cloud. It gives you complete control of your application's lifecycle, from build to production, regardless of whether you deploy from source or run a pre-built container. In the Developer Tools, Overview and Roadmap Part I summit session, Todd Mancini, Peter Muir, and James Strachan take a packed house through an introduction to OpenShift.io...

Article

7 Freaking Awesome things about OpenShift.io

Brian Atkisson

Today's announcement of Red Hat OpenShift.io was followed by a full day of developer toolset Summit sessions. These were presented by the OpenShift.io product development team and covered some truly amazing OpenShift.io features. While there are too many features to cover in a single blog post, these were my top 7 items. 1. A Kanban board that is actually useful: OpenShift.io is built from the ground up for development teams to rapidly release software. This is one of the primary...

Article

Offload your database data into an in-memory data grid for fast processing made easy

Cojan van Ballegooijen

An in-memory data grid is a distributed data management platform for application data that: uses memory (RAM) to store information for very fast, low-latency response times and very high throughput; keeps copies of that information synchronized across multiple servers for continuous availability, information reliability, and linear scalability; and can be used as a distributed cache, NoSQL database, event broker, compute grid, and Apache Spark data store. The technical advantages of an in-memory data grid (IMDG) provide business benefits in the form of...
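
As a rough sketch of the distributed-cache usage pattern described above, the snippet below puts, gets, and removes a value over a data grid's REST endpoint. The host, port, cache name, and URL layout are assumptions (they vary by JBoss Data Grid / Infinispan version), so treat this as an illustration rather than the product's exact API.

```python
# Hypothetical example of using an in-memory data grid as a distributed
# cache over HTTP. The endpoint layout (/rest/<cache>/<key>) is an
# assumption based on older Infinispan/JDG REST APIs; host, port, and
# cache name are placeholders.
import requests

BASE = "http://datagrid.example.com:8080/rest/default"  # placeholder cache endpoint

# Store a value under a key (acts like a distributed put).
requests.put(f"{BASE}/greeting", data="hello from the grid",
             headers={"Content-Type": "text/plain"})

# Read it back; any node in the cluster should return the same entry.
resp = requests.get(f"{BASE}/greeting")
print(resp.status_code, resp.text)

# Remove the entry when it is no longer needed.
requests.delete(f"{BASE}/greeting")
```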

Article

Red Hat JBoss Data Virtualization on OpenShift: Part 4 - Bringing data from outside to inside the PaaS

Cojan van Ballegooijen

Welcome to part 4 of Red Hat JBoss Data Virtualization (JDV) running on OpenShift. JDV is a lean, virtual data integration solution that unlocks trapped data and delivers it as easily consumable, unified, and actionable information. JDV makes data spread across physically diverse systems such as multiple databases, XML files, and Hadoop systems appear as a set of tables in a local database. When deployed on OpenShift, JDV enables: service-enabling your data, bringing data from outside to inside the...

Article

Running Spark Jobs On OpenShift

Zak Hassan

Introduction: Jobs are a feature of OpenShift, and today I will explain how you can use jobs to run your Spark machine learning and data science applications against Spark running on OpenShift. You can run jobs as a batch or on a schedule, which provides cron-like functionality. If a job fails, by default OpenShift will retry the job creation. At the end of this article, I have a video demonstration of running Spark jobs from OpenShift templates against Spark running on...
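
For context, a Spark application launched this way is just a script packaged into an image and pointed at the Spark cluster; the OpenShift job or scheduled job wraps its execution. Here is a minimal PySpark sketch of such a script, with the application name and Spark master URL as placeholders.

```python
# Minimal PySpark sketch of a script that could be baked into an image and
# launched as an OpenShift job. The app name and Spark master URL are
# placeholders for the Spark cluster service running on OpenShift.
import random

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("pi-estimate-job")            # placeholder application name
    .master("spark://spark-master:7077")   # placeholder cluster service URL
    .getOrCreate()
)

NUM_SAMPLES = 1000000

def inside(_):
    # Sample a random point in the unit square and test whether it falls
    # inside the quarter circle of radius 1.
    x, y = random.random(), random.random()
    return x * x + y * y < 1.0

count = spark.sparkContext.parallelize(range(NUM_SAMPLES)).filter(inside).count()
print("Pi is roughly %.4f" % (4.0 * count / NUM_SAMPLES))

spark.stop()
```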

Article

Red Hat JBoss Data Virtualization on OpenShift: Part 3 – Data federation

Cojan van Ballegooijen

Welcome to part 3 of Red Hat JBoss Data Virtualization (JDV) running on OpenShift. JDV is a lean, virtual data integration solution that unlocks trapped data and delivers it as easily consumable, unified, and actionable information. JDV makes data spread across physically diverse systems such as multiple databases, XML files, and Hadoop systems appear as a set of tables in a local database. When deployed on OpenShift, JDV enables: service-enabling your data, bringing data from outside to inside the...

Article

Red Hat JBoss Data Virtualization on OpenShift: Part 1 - Getting started

Cojan van Ballegooijen

Red Hat JBoss Data Virtualization (JDV) is a lean, virtual data integration solution that unlocks trapped data and delivers it as easily consumable, unified, and actionable information. JDV makes data spread across physically diverse systems such as multiple databases, XML files, and Hadoop systems appear as a set of tables in a local database. When deployed on OpenShift, JDV enables: service-enabling your data, bringing data from outside to inside the PaaS, breaking up monolithic data sources virtually for a...
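
Once a virtual database (VDB) is deployed, clients query it like an ordinary database over JDBC or ODBC. The sketch below assumes the PostgreSQL-compatible ODBC transport that JDV/Teiid can expose; the host, port, VDB name, credentials, and view name are placeholders, and the actual port and protocol depend on how your JDV instance is configured.

```python
# Hypothetical query against a JDV virtual database (VDB) using the
# PostgreSQL-compatible transport that Teiid/JDV can expose. Host, port,
# VDB name, credentials, and view name are placeholders.
import psycopg2

conn = psycopg2.connect(
    host="jdv.example.com",
    port=35432,              # assumed ODBC/pg transport port; check your configuration
    dbname="MyVDB",          # the deployed virtual database
    user="teiidUser",
    password="changeme",
)

cur = conn.cursor()
cur.execute("SELECT * FROM CustomerView LIMIT 10")  # placeholder federated view
for row in cur.fetchall():
    print(row)

cur.close()
conn.close()
```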

Article

Announcement: Red Hat JBoss Data Virtualization on OpenShift now available

Cojan van Ballegooijen

We are happy to announce the availability of the Red Hat JBoss Data Virtualization (JDV) 6.3 image running on OpenShift. JDV is a lean, virtual data integration solution that unlocks trapped data and delivers it as easily consumable, unified, and actionable information. JDV makes data spread across physically diverse systems such as multiple databases, XML files, and Hadoop systems appear as a set of tables in a local database. When deployed on OpenShift, JDV enables: service-enabling your data, bringing data...

Article

Unlock your Hadoop data with Hortonworks and Red Hat JBoss Data Virtualization

Cojan van Ballegooijen

Welcome to the first episode of this series: "Unlock your [....] data with Red Hat JBoss Data Virtualization (JDV)." This post will guide you through an example of connecting to a Hadoop source via the Hive2 driver, using Teiid Designer. In this example we will demonstrate connecting to a local Hadoop source. We're using the Hortonworks 2.5 Sandbox running in VirtualBox as our source, but you can connect to another Hortonworks source if you wish, using the same steps...
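
The post itself walks through Teiid Designer; as a quick, optional sanity check that the sandbox's HiveServer2 endpoint is reachable before configuring the driver, a small script like the following can help. It uses the PyHive client rather than the JDBC driver configured in the article, and the host, port, username, and table name are placeholders for a typical Hortonworks sandbox.

```python
# Hypothetical connectivity check against the sandbox's HiveServer2 endpoint
# using PyHive; host, port, username, and table name are placeholders, and
# this substitutes for the JDBC/Teiid Designer setup described in the post.
from pyhive import hive

conn = hive.connect(host="sandbox.hortonworks.com", port=10000, username="raj_ops")
cur = conn.cursor()

cur.execute("SHOW DATABASES")
print(cur.fetchall())

cur.execute("SELECT * FROM sample_07 LIMIT 5")  # placeholder sample table
for row in cur.fetchall():
    print(row)

cur.close()
conn.close()
```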

Article

A Mongo Shell Cheat Sheet

Cian Clarke

There’s a whole host of GUI tools for connecting to and browsing MongoDB databases; however, despite the steeper learning curve, I’ve always found myself more productive using a command-line interface (CLI). Then there’s that moment when something has gone wrong on the database server and we need to SSH four levels deep in order to debug a problem with a database. Sometimes there’s no other option available, and this makes familiarity with the CLI invaluable. I learn best by example...
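
The cheat sheet covers the interactive shell itself; for readers who prefer to script the same kinds of operations, here is a rough equivalent using the PyMongo driver. The connection string, database, and collection names are placeholders, and the shell commands noted in the comments are only approximate counterparts.

```python
# Hypothetical scripted equivalent of common mongo shell operations using
# PyMongo; connection string, database, and collection names are placeholders.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")

db = client["appdb"]                      # roughly `use appdb` in the shell
print(client.list_database_names())       # roughly `show dbs`

users = db["users"]
users.insert_one({"name": "ada", "age": 36})           # roughly db.users.insert(...)
for doc in users.find({"age": {"$gt": 30}}).limit(5):  # roughly db.users.find(...).limit(5)
    print(doc)

print(users.count_documents({}))          # roughly db.users.count()
```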

Article

DevNation Live Blog: Building Reactive Applications with Node.js and Red Hat JBoss Data Grid

Rob Terzi

At DevNation, Red Hat's Galder Zamarreño gave a talk with a live demo, Building reactive applications with Node.js and Red Hat JBoss Data Grid. The demo consisted of building an event-based, three-tier web application using JBoss Data Grid (JDG) as the data layer, an event manager running on Node.js, and a web client. Recently, support for Node.js clients was added to JDG, opening up the performance of a horizontally scalable in-memory data grid to reactive web and mobile applications...

Article

REST and microservices - breaking down the monolith step by asynchronous step

Mark Little

A few days ago I had a rant about the misuse and misunderstanding of REST (typically HTTP) for microservices. To summarize, a few people/groups have been suggesting that you cannot do asynchronous interactions with HTTP, and that as a result of using HTTP you cannot break down a monolithic application into more agile microservices. The fact that most people refer to REST when they really mean HTTP is also a source of personal frustration, because by this stage experienced people...

Article

Microservice principles and Immutability - demonstrated with Apache Spark and Cassandra

jay vyas

Containerizing things is particularly popular these days. Today we'll talk about the idioms we can use for containerization, and specifically play with Apache Spark and Cassandra in our use case for creating easily deployed, immutable microservices. Note: This post uses CentOS 7 as the base for the containers, but the same recipes apply to RHEL and Fedora base images. There are a few different ways to build a container. For example, for beginners, you can build a container...

Article

Webinar: How to Stay Agile with Big Data: A Roadmap - 10 September

Mike Guerette

Agility is the key to benefiting from Big Data for operational excellence and improved profitability. Ovum Research finds that organizations that take an iterative approach to refining analytic models, consolidating data sources, and transitioning to the cloud tend to find more success with Big Data. Attend this webinar to learn how to: consolidate your data sources; build open, flexible Big Data ecosystems; and find success with Big Data. Register for this webinar to learn proven methods on how...

Article

DevNation 2014: Scott McClellan - Hadoop and Beyond

Mike Guerette

Abstract: Historically, the term "Hadoop" has been considered synonymous with its core technologies: MapReduce and the Hadoop Distributed File System (HDFS). But today the definition of Hadoop is rapidly evolving. The Hadoop community is generalizing the application runtime model beyond MapReduce. On the storage front, we're seeing the emergence of many alternative Hadoop-compatible file systems. Red Hat has built an interface layer for its Red Hat Storage Server product. This complete implementation of the Hadoop file system interface lets Hadoop-related...