Renderscript offers a high performance computation API at the native level that you write in C (C99 standard). Renderscript gives your apps the ability to run operations with automatic parallelization across all available processor cores. It also supports different types of processors such as the CPU, GPU or DSP. Renderscript is useful for apps that do image processing, mathematical modeling, or any operations that require lots of mathematical computation.
In addition, you have access to all of these features without having to write code to support different architectures or different numbers of processing cores. You also do not need to recompile your application for different processor types, because Renderscript code is compiled on the device at runtime.
Deprecation Notice: Earlier versions of Renderscript included an experimental graphics engine component. This component is now deprecated as of Android 4.1 (most of the APIs in rs_graphics.rsh and the corresponding APIs in android.renderscript). If you have apps that render graphics with Renderscript, we highly recommend you convert your code to another Android graphics rendering option.
Renderscript System Overview
The Renderscript runtime operates at the native level and still needs to communicate with the Android VM, so the way a Renderscript application is set up is different from a pure VM application. An application that uses Renderscript is still a traditional Android application that runs in the VM, but you write Renderscript code for the parts of your program that require it. No matter what you use it for, Renderscript remains platform independent, so you do not have to target multiple architectures (for example, ARM v5, ARM v7, x86).
The Renderscript system adopts a control and slave architecture where the low-level Renderscript runtime code is controlled by the higher level Android system that runs in a virtual machine (VM). The Android VM still retains all control of memory management and binds memory that it allocates to the Renderscript runtime, so the Renderscript code can access it. The Android framework makes asynchronous calls to Renderscript, and the calls are placed in a message queue and processed as soon as possible. Figure 1 shows how the Renderscript system is structured.
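The queued, asynchronous call pattern described above can be pictured with a small plain-Java analogy. This is only a sketch: CommandQueueSketch, post() and finish() are hypothetical names invented for illustration, and the real Renderscript message queue is internal to the runtime and is not exposed like this.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical analogy: a single worker thread drains a FIFO queue of commands.
final class CommandQueueSketch {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();
    private final List<String> log = new ArrayList<>();

    /** Enqueues a command and returns immediately; the worker runs it later. */
    void post(String name) {
        worker.execute(() -> log.add(name));
    }

    /** Blocks until every queued command has run, then returns the execution order. */
    List<String> finish() {
        worker.shutdown();
        try {
            if (!worker.awaitTermination(5, TimeUnit.SECONDS)) {
                throw new IllegalStateException("queue did not drain in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return new ArrayList<>(log);
    }
}
```

The caller of post() never waits for the work itself. Only a synchronizing call such as finish() blocks, which is roughly how a call that reads results back has to wait for earlier queued commands.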
When using Renderscript, there are three layers of APIs that enable communication between the Renderscript runtime and Android framework code:
- The Renderscript runtime APIs allow you to do the computation that is required by your application.
- The reflected layer APIs are a set of classes that are reflected from your Renderscript runtime code. They are essentially a wrapper around the Renderscript code that allows the Android framework to interact with the Renderscript runtime. The Android build tools automatically generate the classes for this layer during the build process. These classes eliminate the need to write JNI glue code, which you would otherwise have to write when using the NDK.
- The Android framework layer calls the reflected layer to access the Renderscript runtime.
Because of the way Renderscript is structured, the main advantages are:
- Portability: Renderscript is designed to run on many types of devices with different processor architectures (CPU, GPU, and DSP, for instance). It supports all of these architectures without having to target each device, because the code is compiled and cached on the device at runtime.
- Performance: Renderscript provides a high performance computation API with seamless parallelization across all of the cores on the device.
- Usability: Renderscript simplifies development when possible, such as eliminating JNI glue code.
The main disadvantages are:
- Development complexity: Renderscript introduces a new set of APIs that you have to learn.
- Debugging visibility: Renderscript can potentially execute (planned feature for later releases) on processors other than the main CPU (such as the GPU), so if this occurs, debugging becomes more difficult.
For a more detailed explanation of how all of these layers work together, see Advanced Renderscript.
Creating a Renderscript
Renderscripts scale to the number of processing cores available on the device. This is enabled through a function named rsForEach() (or the forEach_root() method at the Android framework level) that automatically partitions work across the available processing cores. For now, Renderscript can only take advantage of CPU cores, but in the future, it can potentially run on other types of processors such as GPUs and DSPs.
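To build intuition for what partitioning work across cores means, here is a minimal plain-Java sketch that splits a per-element operation into one chunk per available CPU core. It is an analogy only: PartitionSketch and its forEach() method are hypothetical names, and this document does not describe how the Renderscript runtime actually schedules work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.function.IntUnaryOperator;

// Hypothetical illustration; not part of the Renderscript API.
final class PartitionSketch {
    /** Applies kernel to every element of in, using one contiguous chunk per core. */
    static int[] forEach(int[] in, IntUnaryOperator kernel) {
        int cores = Runtime.getRuntime().availableProcessors();
        int[] out = new int[in.length];
        ExecutorService pool = Executors.newFixedThreadPool(cores);
        try {
            int chunk = (in.length + cores - 1) / cores; // ceiling division
            List<Future<?>> pending = new ArrayList<>();
            for (int start = 0; start < in.length; start += chunk) {
                final int from = start;
                final int to = Math.min(in.length, start + chunk);
                // Each task writes a disjoint slice of out, so no locking is needed.
                pending.add(pool.submit(() -> {
                    for (int i = from; i < to; i++) {
                        out[i] = kernel.applyAsInt(in[i]);
                    }
                }));
            }
            for (Future<?> f : pending) {
                try {
                    f.get(); // wait for the chunk; surfaces worker failures
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    throw new IllegalStateException(e);
                } catch (ExecutionException e) {
                    throw new IllegalStateException(e.getCause());
                }
            }
        } finally {
            pool.shutdown();
        }
        return out;
    }
}
```

For example, PartitionSketch.forEach(pixels, p -> p * 2) doubles every element. The kernel only ever sees one element at a time, which is what lets the work be divided freely among cores.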
Implementing a Renderscript involves creating a .rs file that contains your Renderscript code and calling it at the Android framework level with the forEach_root() method or at the Renderscript runtime level with the rsForEach() function. The following diagram describes how a typical Renderscript is set up:
The following sections describe how to create a simple Renderscript and use it in an Android application. This example uses the HelloCompute Renderscript sample that is provided in the SDK as a guide (some code has been modified from its original form for simplicity).
Creating the Renderscript file
Your Renderscript code resides in .rs and .rsh files in the <project_root>/src/ directory. This code contains the computation logic and declares all necessary variables and pointers.
Every .rs file generally contains the following items:
- A pragma declaration (#pragma rs java_package_name(package.name)) that declares the package name of the .java reflection of this Renderscript.
- A pragma declaration (#pragma version(1)) that declares the version of Renderscript that you are using (1 is the only value for now).
- A root() function that is the main worker function. The root function is called by the rsForEach function, which allows the Renderscript code to be called and executed on multiple cores if they are available. The root() function must return void and accept the following arguments:
  - Pointers to memory allocations that are used for the input and output of the Renderscript. Both of these pointers are required for Android 3.2 (API level 13) platform versions or older. Android 4.0 (API level 14) and later requires one or both of these allocations.
  The following arguments are optional, but both must be supplied if you choose to use them:
  - A pointer for user-defined data that the Renderscript might need to carry out computations in addition to the necessary allocations. This can be a pointer to a simple primitive or a more complex struct.
  - The size of the user-defined data.
- An optional init() function. This allows you to do any initialization before the root() function runs, such as initializing variables. This function runs once and is called automatically when the Renderscript starts, before anything else in your Renderscript.
- Any variables, pointers, and structures that you wish to use in your Renderscript code (can be declared in .rsh files if desired).
The following code shows how the mono.rs file is implemented:
#pragma version(1)
#pragma rs java_package_name(com.example.android.rs.hellocompute)

//multipliers to convert RGB colors to black and white
const static float3 gMonoMult = {0.299f, 0.587f, 0.114f};

void root(const uchar4 *v_in, uchar4 *v_out) {
  //unpack a color to a float4
  float4 f4 = rsUnpackColor8888(*v_in);
  //take the dot product of the color and the multiplier
  float3 mono = dot(f4.rgb, gMonoMult);
  //repack the float to a color
  *v_out = rsPackColorTo8888(mono);
}
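As a cross-check of the arithmetic in root(), the same per-pixel computation can be sketched in plain Java, for instance to compare against the script's output in a test. MonoReference is a hypothetical helper, not part of the HelloCompute sample. It assumes the pixel is a packed ARGB int as returned by Bitmap.getPixel(), that the result is fully opaque, and that packing rounds to the nearest integer; the exact rounding used by rsPackColorTo8888 is not stated in this document, so results may differ by one level.

```java
// Hypothetical CPU reference for the mono.rs kernel; rounding and alpha are assumptions.
final class MonoReference {
    // Same weights as gMonoMult in mono.rs.
    static final float R = 0.299f;
    static final float G = 0.587f;
    static final float B = 0.114f;

    /** Converts one packed ARGB pixel to an opaque gray pixel. */
    static int toMono(int argb) {
        // Unpack each channel to the 0.0 .. 1.0 range.
        float r = ((argb >> 16) & 0xFF) / 255.0f;
        float g = ((argb >> 8) & 0xFF) / 255.0f;
        float b = (argb & 0xFF) / 255.0f;
        // Dot product of the color and the weights.
        float luma = r * R + g * G + b * B;
        // Repack: scale to 0 .. 255, round to nearest, and clamp (assumed behavior).
        int gray = Math.min(255, Math.max(0, Math.round(luma * 255.0f)));
        return 0xFF000000 | (gray << 16) | (gray << 8) | gray;
    }
}
```

Because the three weights sum to 1, a gray input pixel keeps its brightness and white stays white.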
Calling the Renderscript code
You can call the Renderscript from your Android framework code by instantiating the reflected ScriptC_script_name class. This class contains a method, forEach_root(), that lets you invoke rsForEach. You give it the same parameters that you would if you were invoking it at the Renderscript runtime level. This technique allows your Android application to offload intensive mathematical calculations to Renderscript. See the HelloCompute sample to see how a simple Android application can utilize Renderscript.
To call Renderscript at the Android framework level:
- Allocate memory that is needed by the Renderscript in your Android framework code. You need an input and output Allocation for Android 3.2 (API level 13) platform versions and older. The Android 4.0 (API level 14) platform version requires one or both Allocations.
- Create an instance of the ScriptC_script_name class.
- Call forEach_root(), passing in the allocations, the Renderscript, and any optional user-defined data. The output allocation will contain the output of the Renderscript.
The following example, taken from the HelloCompute sample, processes a bitmap and outputs a black and white version of it. The createScript() method carries out the steps described previously. This method calls the Renderscript, mono.rs, passing in memory allocations that store the bitmap to be processed as well as the eventual output bitmap. It then displays the processed bitmap onto the screen:
package com.example.android.rs.hellocompute;

import android.app.Activity;
import android.os.Bundle;
import android.graphics.BitmapFactory;
import android.graphics.Bitmap;
import android.renderscript.RenderScript;
import android.renderscript.Allocation;
import android.widget.ImageView;

public class HelloCompute extends Activity {
    private Bitmap mBitmapIn;
    private Bitmap mBitmapOut;

    private RenderScript mRS;
    private Allocation mInAllocation;
    private Allocation mOutAllocation;
    private ScriptC_mono mScript;

    @Override
    protected void onCreate(Bundle savedInstanceState) {
        super.onCreate(savedInstanceState);
        setContentView(R.layout.main);

        mBitmapIn = loadBitmap(R.drawable.data);
        mBitmapOut = Bitmap.createBitmap(mBitmapIn.getWidth(), mBitmapIn.getHeight(),
                                         mBitmapIn.getConfig());

        ImageView in = (ImageView) findViewById(R.id.displayin);
        in.setImageBitmap(mBitmapIn);

        ImageView out = (ImageView) findViewById(R.id.displayout);
        out.setImageBitmap(mBitmapOut);

        createScript();
    }

    private void createScript() {
        mRS = RenderScript.create(this);
        mInAllocation = Allocation.createFromBitmap(mRS, mBitmapIn,
            Allocation.MipmapControl.MIPMAP_NONE,
            Allocation.USAGE_SCRIPT);
        mOutAllocation = Allocation.createTyped(mRS, mInAllocation.getType());
        mScript = new ScriptC_mono(mRS, getResources(), R.raw.mono);
        mScript.forEach_root(mInAllocation, mOutAllocation);
        mOutAllocation.copyTo(mBitmapOut);
    }

    private Bitmap loadBitmap(int resource) {
        final BitmapFactory.Options options = new BitmapFactory.Options();
        options.inPreferredConfig = Bitmap.Config.ARGB_8888;
        return BitmapFactory.decodeResource(getResources(), resource, options);
    }
}
To call Renderscript from another Renderscript file:
- Allocate memory that is needed by the Renderscript in your Android framework code. You need an input and output Allocation for Android 3.2 (API level 13) platform versions and older. The Android 4.0 (API level 14) platform version requires one or both Allocations.
- Call rsForEach(), passing in the allocations and any optional user-defined data. The output allocation will contain the output of the Renderscript.
rs_script script;
rs_allocation in_allocation;
rs_allocation out_allocation;
UserData_t data;
...
rsForEach(script, in_allocation, out_allocation, &data, sizeof(data));
In this example, assume that the script and memory allocations have already been allocated and bound at the Android framework level and that UserData_t is a struct declared previously. Passing a pointer to a struct and the size of the struct to rsForEach is optional, but useful if your Renderscript requires additional information other than the necessary memory allocations.
Setting floating point precision
You can define the floating point precision required by your compute algorithms. This is useful if you require less precision than the IEEE 754-2008 standard (used by default). You can define the floating-point precision level of your script with the following pragmas:
- #pragma rs_fp_full (default if nothing is specified): For apps that require floating point precision as outlined by the IEEE 754-2008 standard.
- #pragma rs_fp_relaxed: For apps that don't require strict IEEE 754-2008 compliance and can tolerate less precision. This mode enables flush-to-zero for denorms and round-towards-zero.
- #pragma rs_fp_imprecise: For apps that don't have stringent precision requirements. This mode enables everything in rs_fp_relaxed along with the following:
  - Operations resulting in -0.0 can return +0.0 instead.
  - Operations on INF and NAN are undefined.
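To make "flush-to-zero for denorms" concrete, the following plain-Java sketch emulates the effect on a single float value. Java itself always uses full IEEE 754 semantics, so this is only an illustration of what a relaxed mode may do to very small results. RelaxedFloatSketch and flushToZero() are hypothetical names, and keeping the sign of the flushed zero is an assumption of this sketch.

```java
// Hypothetical emulation of flush-to-zero; not how the Renderscript runtime implements it.
final class RelaxedFloatSketch {
    /** Returns true if x is a denormal (subnormal) value: nonzero but below the smallest normal float. */
    static boolean isDenormal(float x) {
        return x != 0.0f && Math.abs(x) < Float.MIN_NORMAL;
    }

    /** Replaces denormal values with a zero of the same sign and leaves all other values unchanged. */
    static float flushToZero(float x) {
        return isDenormal(x) ? Math.copySign(0.0f, x) : x;
    }
}
```

For example, Float.MIN_VALUE (about 1.4e-45) is denormal and is flushed to 0.0f, while Float.MIN_NORMAL (about 1.18e-38) passes through unchanged. An algorithm that relies on such tiny intermediate values is a candidate for keeping the default rs_fp_full mode.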