AI/ML

Article

sig-big-data: Apache Spark and Apache Airflow on Kubernetes

David Millsaps

This presentation will cover two projects from sig-big-data: Apache Spark on Kubernetes and Apache Airflow on Kubernetes. We will give an overview of the current state of both projects, present their roadmaps, and give attendees the opportunity to ask questions and provide feedback on those roadmaps.
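
To make the pairing concrete, here is a minimal sketch of an Airflow DAG that shells out to spark-submit against a Kubernetes cluster. The API server URL, container image, example jar path, and schedule are placeholders, and the exact import paths and spark-submit flags depend on the Airflow and Spark versions you deploy.

```python
# Hypothetical sketch: an Airflow DAG that submits the Spark Pi example to a
# Kubernetes cluster via spark-submit. API server URL, image, and jar path
# are placeholders; adjust for your Airflow/Spark versions.
from datetime import datetime, timedelta

from airflow import DAG
from airflow.operators.bash_operator import BashOperator

default_args = {
    "owner": "sig-big-data",
    "retries": 1,
    "retry_delay": timedelta(minutes=5),
}

dag = DAG(
    dag_id="spark_pi_on_kubernetes",
    default_args=default_args,
    start_date=datetime(2018, 1, 1),
    schedule_interval="@daily",  # run once a day, cron-style
)

submit_spark_pi = BashOperator(
    task_id="submit_spark_pi",
    bash_command=(
        "spark-submit "
        "--master k8s://https://kubernetes.default.svc:443 "  # placeholder API server
        "--deploy-mode cluster "
        "--name spark-pi "
        "--class org.apache.spark.examples.SparkPi "
        "--conf spark.kubernetes.container.image=example/spark:latest "  # placeholder image
        "local:///opt/spark/examples/jars/spark-examples.jar 1000"
    ),
    dag=dag,
)
```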

Article

Why Data Scientists Love Kubernetes

David Millsaps

Why Data Scientists Love Kubernetes - Sophie Watson & William Benton, Red Hat (https://www.youtube.com/watch?v=frQeK8xo9Ls). This talk will introduce the workflows and concerns of data scientists and machine learning engineers and demonstrate how to make Kubernetes a powerhouse for intelligent applications. We’ll show how community projects like Kubeflow and radanalytics.io support the entire intelligent application development lifecycle. We’ll cover several key benefits of Kubernetes for a data scientist’s workflow, from experiment design to publishing results. You’ll see how well scale-out...

Article

JBoss Data Virtualization: Integrating with Impala on Cloudera

Mike Echevarria

Integrate Cloudera's Apache Impala implementation as a data source in Red Hat's JBoss Data Virtualization. The goal of this post is to import data from a Cloudera Impala instance, manipulate it, and expose that data as a data service.
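
Before wiring Impala into JDV, it can help to confirm that the Impala daemon is reachable and the data looks right with a quick ad hoc query. Below is a minimal sketch using the impyla Python client; the host, port, and table name are placeholders, and this check is separate from the JDV data source configuration the post describes.

```python
# Hypothetical connectivity check against a Cloudera Impala instance using
# the impyla client; host, port, and table name are placeholders.
from impala.dbapi import connect

conn = connect(host="impala.example.com", port=21050)  # default impalad HiveServer2 port
cur = conn.cursor()

cur.execute("SHOW TABLES")
print(cur.fetchall())

cur.execute("SELECT * FROM my_table LIMIT 5")  # placeholder table
for row in cur.fetchall():
    print(row)

cur.close()
conn.close()
```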

Article

Unlock Your Cloudera Data with Red Hat JBoss Data Virtualization

Madou Coulibaly

After the "Unlock your Hadoop data with Hortonworks and Red Hat JBoss Data Virtualization" episode, let's continue the journey with another Apache Hadoop episode of the series "Unlock your [….] data with Red Hat JBoss Data Virtualization." Through this blog series, we will look at how to connect Red Hat JBoss Data Virtualization (JDV) to different and heterogeneous data sources. JDV is a lean, virtual data integration solution that unlocks trapped data and delivers it as easily consumable, unified, and actionable...

Article

Achieving Deployment Excellence with Red Hat OpenShift.io

Rob Terzi

Recently, the focus on the continuous delivery of value has created a lot of interest in microservices, CI/CD, and containers. The idea is that microservices are small and well-defined enough to enable rapid innovation, automated testing, and frequent deployments with minimal risk. This is made possible by adopting continuous integration and continuous delivery pipelines. CI/CD requires the ability to quickly, easily, reliably, and automatically create and tear down complete execution environments. Linux containers address this need by creating lightweight...

Article

OpenShift.io Developer Tools Overview - Summit 2017 - The Power of Cloud Workspaces - Part 2

Brian Atkisson

Part II of the OpenShift.io Developer Tools overview follows on the heels of the introduction session, this time presented by Pete Muir and Gorkem Ercan. In this session, we are taken through the integrated OpenShift.io Eclipse Che IDE. What is a Cloud Workspace? One of the fundamental problems with today's development methodology is that development happens on your laptop-- in a completely different environment from production. This is one of the major sources of bugs as your software is...

Article

OpenShift.io The Gathering - Summit 2017 - Developer Tools, Overview and Roadmap Part I

Brian Atkisson

Yesterday, at Red Hat Summit, Red Hat announced OpenShift.io. OpenShift.io is the next-generation OpenShift platform, based on OpenShift 3, for building and running applications in the cloud. It gives you complete control of your application's lifecycle, from build to production, regardless of whether you deploy from source or run a pre-built container. In the Developer Tools, Overview and Roadmap Part I summit session, Todd Mancini, Peter Muir, and James Strachan take a packed house through an introduction to OpenShift.io...

Article

7 Freaking Awesome things about OpenShift.io

Brian Atkisson

Today's announcement of Red Hat OpenShift.io was followed by a full day of developer toolset Summit sessions. These were presented by the OpenShift.io product development team and covered some truly amazing OpenShift.io features. While there are too many features to cover in a single blog post, these were my top 7 items. 1. A Kanban board that is actually useful: OpenShift.io is built from the ground up for development teams to rapidly release software. This is one of the primary...

Article

Offload your database data into an in-memory data grid for fast processing made easy

Cojan van Ballegooijen

An in-memory data grid is a distributed data management platform for application data that: uses memory (RAM) to store information for very fast, low-latency response times and very high throughput; keeps copies of that information synchronized across multiple servers for continuous availability, information reliability, and linear scalability; and can be used as a distributed cache, NoSQL database, event broker, compute grid, and Apache Spark data store. The technical advantages of an in-memory data grid (IMDG) provide business benefits in the form of...
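
As a rough sketch of the distributed-cache usage pattern described above, the snippet below puts, gets, and removes a value over a data grid's REST endpoint. The host, port, cache name, and URL layout are assumptions (they vary by JBoss Data Grid / Infinispan version), so treat this as an illustration rather than the product's exact API.

```python
# Hypothetical example of using an in-memory data grid as a distributed
# cache over HTTP. The endpoint layout (/rest/<cache>/<key>) is an
# assumption based on older Infinispan/JDG REST APIs; host, port, and
# cache name are placeholders.
import requests

BASE = "http://datagrid.example.com:8080/rest/default"  # placeholder cache endpoint

# Store a value under a key (acts like a distributed put).
requests.put(f"{BASE}/greeting", data="hello from the grid",
             headers={"Content-Type": "text/plain"})

# Read it back; any node in the cluster should return the same entry.
resp = requests.get(f"{BASE}/greeting")
print(resp.status_code, resp.text)

# Remove the entry when it is no longer needed.
requests.delete(f"{BASE}/greeting")
```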

Article

Red Hat JBoss Data Virtualization on OpenShift: Part 4 - Bringing data from outside to inside the PaaS

Cojan van Ballegooijen

Welcome to part 4 of Red Hat JBoss Data Virtualization (JDV) running on OpenShift. JDV is a lean, virtual data integration solution that unlocks trapped data and delivers it as easily consumable, unified, and actionable information. JDV makes data spread across physically diverse systems such as multiple databases, XML files, and Hadoop systems appear as a set of tables in a local database. When deployed on OpenShift, JDV enables: service-enabling your data, bringing data from outside to inside the...

Article

Running Spark Jobs On OpenShift

Zak Hassan

Introduction: Jobs are a feature of OpenShift, and today I will explain how you can use jobs to run your Spark machine learning and data science applications against Spark running on OpenShift. You can run jobs as a batch or on a schedule, which provides cron-like functionality. If a job fails, by default OpenShift will retry the job creation. At the end of this article, I have a video demonstration of running Spark jobs from OpenShift templates against Spark running on...
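
For context, a Spark application launched this way is just a script packaged into an image and pointed at the Spark cluster; the OpenShift job or scheduled job wraps its execution. Here is a minimal PySpark sketch of such a script, with the application name and Spark master URL as placeholders.

```python
# Minimal PySpark sketch of a script that could be baked into an image and
# launched as an OpenShift job. The app name and Spark master URL are
# placeholders for the Spark cluster service running on OpenShift.
import random

from pyspark.sql import SparkSession

spark = (
    SparkSession.builder
    .appName("pi-estimate-job")            # placeholder application name
    .master("spark://spark-master:7077")   # placeholder cluster service URL
    .getOrCreate()
)

NUM_SAMPLES = 1000000

def inside(_):
    # Sample a random point in the unit square and test whether it falls
    # inside the quarter circle of radius 1.
    x, y = random.random(), random.random()
    return x * x + y * y < 1.0

count = spark.sparkContext.parallelize(range(NUM_SAMPLES)).filter(inside).count()
print("Pi is roughly %.4f" % (4.0 * count / NUM_SAMPLES))

spark.stop()
```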

Article

Red Hat JBoss Data Virtualization on OpenShift: Part 3 – Data federation

Cojan van Ballegooijen

Welcome to part 3 of Red Hat JBoss Data Virtualization (JDV) running on OpenShift. JDV is a lean, virtual data integration solution that unlocks trapped data and delivers it as easily consumable, unified, and actionable information. JDV makes data spread across physically diverse systems such as multiple databases, XML files, and Hadoop systems appear as a set of tables in a local database. When deployed on OpenShift, JDV enables: service-enabling your data, bringing data from outside to inside the...

Article

Red Hat JBoss Data Virtualization on OpenShift: Part 1 - Getting started

Cojan van Ballegooijen

Red Hat JBoss Data Virtualization (JDV) is a lean, virtual data integration solution that unlocks trapped data and delivers it as easily consumable, unified, and actionable information. JDV makes data spread across physically diverse systems such as multiple databases, XML files, and Hadoop systems appear as a set of tables in a local database. When deployed on OpenShift, JDV enables: service-enabling your data, bringing data from outside to inside the PaaS, breaking up monolithic data sources virtually for a...
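
Once a virtual database (VDB) is deployed, clients query it like an ordinary database over JDBC or ODBC. The sketch below assumes the PostgreSQL-compatible ODBC transport that JDV/Teiid can expose; the host, port, VDB name, credentials, and view name are placeholders, and the actual port and protocol depend on how your JDV instance is configured.

```python
# Hypothetical query against a JDV virtual database (VDB) using the
# PostgreSQL-compatible transport that Teiid/JDV can expose. Host, port,
# VDB name, credentials, and view name are placeholders.
import psycopg2

conn = psycopg2.connect(
    host="jdv.example.com",
    port=35432,              # assumed ODBC/pg transport port; check your configuration
    dbname="MyVDB",          # the deployed virtual database
    user="teiidUser",
    password="changeme",
)

cur = conn.cursor()
cur.execute("SELECT * FROM CustomerView LIMIT 10")  # placeholder federated view
for row in cur.fetchall():
    print(row)

cur.close()
conn.close()
```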

Article

Announcement: Red Hat JBoss Data Virtualization on OpenShift now available

Cojan van Ballegooijen

We are happy to announce the availability of the Red Hat JBoss Data Virtualization (JDV) 6.3 image running on OpenShift. JDV is a lean, virtual data integration solution that unlocks trapped data and delivers it as easily consumable, unified, and actionable information. JDV makes data spread across physically diverse systems such as multiple databases, XML files, and Hadoop systems appear as a set of tables in a local database. When deployed on OpenShift, JDV enables: service-enabling your data, bringing data...

Article

Unlock your Hadoop data with Hortonworks and Red Hat JBoss Data Virtualization

Cojan van Ballegooijen

Welcome to the first episode of this series: "Unlock your [....] data with Red Hat JBoss Data Virtualization (JDV)." This post will guide you through an example of connecting to a Hadoop source via the Hive2 driver, using Teiid Designer. In this example we will demonstrate connecting to a local Hadoop source. We're using the Hortonworks 2.5 Sandbox running in VirtualBox as our source, but you can connect to another Hortonworks source if you wish, using the same steps...
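
The post itself walks through Teiid Designer; as a quick, optional sanity check that the sandbox's HiveServer2 endpoint is reachable before configuring the driver, a small script like the following can help. It uses the PyHive client rather than the JDBC driver configured in the article, and the host, port, username, and table name are placeholders for a typical Hortonworks sandbox.

```python
# Hypothetical connectivity check against the sandbox's HiveServer2 endpoint
# using PyHive; host, port, username, and table name are placeholders, and
# this substitutes for the JDBC/Teiid Designer setup described in the post.
from pyhive import hive

conn = hive.connect(host="sandbox.hortonworks.com", port=10000, username="raj_ops")
cur = conn.cursor()

cur.execute("SHOW DATABASES")
print(cur.fetchall())

cur.execute("SELECT * FROM sample_07 LIMIT 5")  # placeholder sample table
for row in cur.fetchall():
    print(row)

cur.close()
conn.close()
```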

Article

A Mongo Shell Cheat Sheet

Cian Clarke

There’s a whole host of GUI tools for connecting to and browsing MongoDB databases; however, despite the steeper learning curve, I’ve always found myself more productive using a command-line interface (CLI). Then there’s that moment when something has gone wrong on the database server and we need to SSH four levels deep in order to debug a problem with a database. Sometimes there’s no other option available, and this makes familiarity with the CLI invaluable. I learn best by example...
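
The cheat sheet covers the interactive shell itself; for readers who prefer to script the same kinds of operations, here is a rough equivalent using the PyMongo driver. The connection string, database, and collection names are placeholders, and the shell commands noted in the comments are only approximate counterparts.

```python
# Hypothetical scripted equivalent of common mongo shell operations using
# PyMongo; connection string, database, and collection names are placeholders.
from pymongo import MongoClient

client = MongoClient("mongodb://localhost:27017")

db = client["appdb"]                      # roughly `use appdb` in the shell
print(client.list_database_names())       # roughly `show dbs`

users = db["users"]
users.insert_one({"name": "ada", "age": 36})           # roughly db.users.insert(...)
for doc in users.find({"age": {"$gt": 30}}).limit(5):  # roughly db.users.find(...).limit(5)
    print(doc)

print(users.count_documents({}))          # roughly db.users.count()
```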

Article

DevNation Live Blog: Building Reactive Applications with Node.js and Red Hat JBoss Data Grid

Rob Terzi

At DevNation, Red Hat's Galder Zamarreño gave a talk with a live demo, Building reactive applications with Node.js and Red Hat JBoss Data Grid. The demo consisted of building an event-based, three-tier web application using JBoss Data Grid (JDG) as the data layer, an event manager running on Node.js, and a web client. Recently, support for Node.js clients was added to JDG, opening up the performance of a horizontally scalable in-memory data grid to reactive web and mobile applications...

Article

REST and microservices - breaking down the monolith step by asynchronous step

Mark Little

A few days ago I had a rant about the misuse and misunderstanding of REST (typically HTTP) for microservices. To summarize, a few people/groups have been suggesting that you cannot do asynchronous interactions with HTTP, and that as a result of using HTTP you cannot break down a monolithic application into more agile microservices. The fact that most people refer to REST when they really mean HTTP is also a source of personal frustration, because by this stage experienced people...

Article

Microservice principles and Immutability - demonstrated with Apache Spark and Cassandra

jay vyas

Containerizing things is particularly popular these days. Today we'll talk about the idioms we can use for containerization, and specifically play with Apache Spark and Cassandra in our use case for creating easily deployed, immutable microservices. Note: This post uses CentOS 7 as the base for the containers, but the same recipes apply to RHEL and Fedora base images. There are a few different ways to build a container. For example, for beginners, you can build a container...

Article

Webinar: How to Stay Agile with Big Data: A Roadmap - 10 September

Mike Guerette

Agility is the key to benefiting from Big Data for operational excellence and improved profitability. Ovum Research finds that organizations that take an iterative approach to refining analytic models, consolidating data sources, and transitioning to the cloud tend to find more success with Big Data. Attend this webinar to learn how to: consolidate your data sources; build open, flexible Big Data ecosystems; and find success with Big Data. Register for this webinar to learn proven methods on how...

Article

DevNation 2014: Scott McClellan - Hadoop and Beyond

Mike Guerette

Abstract: Historically, the term "Hadoop" has been considered synonymous with its core technologies: MapReduce and the Hadoop Distributed File System (HDFS). But today the definition of Hadoop is rapidly evolving. The Hadoop community is generalizing the application runtime model beyond MapReduce. On the storage front, we're seeing the emergence of many alternative Hadoop-compatible file systems. Red Hat has built an interface layer for its Red Hat Storage Server product. This complete implementation of the Hadoop file system interface lets Hadoop-related...