A proper (logical) alternative for Unicode strings:
<?php
function substr_unicode($str, $s, $l = null) {
    return join("", array_slice(
        preg_split("//u", $str, -1, PREG_SPLIT_NO_EMPTY), $s, $l));
}
$str = "Büyük";
$s = 0; // start from "0" (nth) char
$l = 3; // get "3" chars
echo substr($str, $s, $l) ."\n"; // Bü
echo mb_substr($str, $s, $l) ."\n"; // Bü (when the internal encoding is not UTF-8)
echo substr_unicode($str, $s, $l); // Büy
?>
mb_substr
(PHP 4 >= 4.0.6, PHP 5)
mb_substr — Get part of string
Description
string mb_substr ( string $str , int $start [, int $length [, string $encoding ]] )
Performs a multi-byte safe substr() operation based on the number of characters. The position is counted from the beginning of str. The position of the first character is 0, the position of the second is 1, and so on.
Parameters
- str - The string to extract the substring from.
- start - The position of the first character to use from str.
- length - The maximum number of characters to use from str.
- encoding - The encoding parameter is the character encoding. If it is omitted, the internal character encoding value will be used.
Return Values
mb_substr() returns the portion of str specified by the start and length parameters.
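The following is a minimal sketch of the character-based counting described above. The sample string and the explicit 'UTF-8' argument are illustrative assumptions and do not come from the manual text.

```php
<?php
// Sample UTF-8 string: each "ü" is one character but two bytes.
$str = "Büyük";

// Three characters starting at position 0 (the first character).
$first = mb_substr($str, 0, 3, 'UTF-8');

// One character starting at position 1 (the second character).
$second = mb_substr($str, 1, 1, 'UTF-8');

echo $first, "\n";  // Büy
echo $second, "\n"; // ü
```

Passing the encoding explicitly keeps the result independent of the internal encoding configured on the server.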
See Also
- mb_strcut() - Get part of string
- mb_internal_encoding() - Set/Get internal character encoding
qeremy [atta] gmail [dotta] com
27-Feb-2012 03:58
p dot assenov at aip-solutions dot com
02-Dec-2011 09:17
I'm trying to capitalize only the first character of the string and tried some of the examples above but they didn't work. It seems mb_substr() cannot calculate the length of the string in multi-byte encoding (UTF-8) and it should be set explicitly. Here is the corrected version:
<?php
function mb_ucfirst($str, $enc = 'utf-8') {
return mb_strtoupper(mb_substr($str, 0, 1, $enc), $enc).mb_substr($str, 1, mb_strlen($str, $enc), $enc);
}
?>
cheers!
levani9191 at gmail dot com
18-Jul-2010 03:37
A simple snippet that checks whether the last character of the string is a question mark and appends one if it is not:
<?php $string = (mb_substr($string, -1, 1, 'UTF-8') != '?') ? $string.'?' : $string; ?>
Anonymous
26-Feb-2010 05:15
If start is negative, the returned string starts at the start'th character counted from the end of the string.
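A short sketch of the negative start behaviour, assuming a UTF-8 sample string chosen only for illustration:

```php
<?php
$str = "Büyük";

// -2 means "start two characters before the end of the string".
$tail = mb_substr($str, -2, 2, 'UTF-8');

// -1 selects the last character only.
$last = mb_substr($str, -1, 1, 'UTF-8');

echo $tail, "\n"; // ük
echo $last, "\n"; // k
```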
dziamid at gmail dot com
06-Feb-2009 08:27
Here is my solution to highlighting search queries in multibyte text:
<?php
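function mb_highlight($data, $query, $ins_before, $ins_after)
{
    // An empty query matches nothing; return the text unchanged.
    if ($query === '') {
        return $data;
    }
    $result = '';
    $query_len = mb_strlen($query);
    while (($poz = mb_strpos(mb_strtolower($data), mb_strtolower($query))) !== false)
    {
        // Copy the text before the match, then the match wrapped in the markers.
        $result .= mb_substr($data, 0, $poz).
            $ins_before.
            mb_substr($data, $poz, $query_len).
            $ins_after;
        // Continue searching in the text after the match.
        $data = mb_substr($data, $poz + $query_len);
    }
    // Append whatever follows the last match.
    return $result . $data;
}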
?>
Enjoy!
[EDIT BY danbrown AT php DOT net: Reclassified to a more appropriate function manual page.]
projektas at gmail dot com
21-Oct-2008 06:29
First letter in upper case:
<?php
header ('Content-type: text/html; charset=utf-8');
if (isset($_POST['check']) && !empty($_POST['check'])) {
echo htmlspecialchars(ucfirst_utf8($_POST['check']));
} else {
echo htmlspecialchars(ucfirst_utf8('Žąsinų'));
}
function ucfirst_utf8($str) {
if (mb_check_encoding($str,'UTF-8')) {
$first = mb_substr(
mb_strtoupper($str, "utf-8"),0,1,'utf-8'
);
return $first.mb_substr(
mb_strtolower($str,"utf-8"),1,mb_strlen($str,'utf-8'),'utf-8'
);
} else {
return $str;
}
}
?>
<form method="post" action="" >
<input type="text" name="check" />
<input type="submit" />
</form>
Silvan
01-Sep-2007 03:30
Passing null as length will not make mb_substr() use its default; instead, null is interpreted as 0.
<?php
mb_substr($str,$start,null,$encoding); //Returns '' (empty string) just like substr()
?>
Instead use:
<?php
mb_substr($str,$start,mb_strlen($str),$encoding);
?>
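The workaround above can be wrapped in a small helper. This is only a sketch: the name mb_substr_to_end and its 'UTF-8' default are hypothetical and are not part of the mbstring extension.

```php
<?php
// Hypothetical helper: treats a null $length as "up to the end of the string".
function mb_substr_to_end($str, $start, $length = null, $encoding = 'UTF-8') {
    if ($length === null) {
        // The full character count is always enough to reach the end.
        $length = mb_strlen($str, $encoding);
    }
    return mb_substr($str, $start, $length, $encoding);
}

echo mb_substr_to_end("Büyük", 2), "\n";    // yük
echo mb_substr_to_end("Büyük", 2, 2), "\n"; // yü
```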
xiaogil at yahoo dot fr
02-Aug-2005 08:33
Thanks Darien from /freenode #php for the following example (a little bit changed).
It just prints the 6th character of $string.
You can replace the digits with their Japanese, Chinese or other equivalents to test it; it works perfectly.
<?php
mb_internal_encoding("UTF-8");
$string = "0123456789";
$mystring = mb_substr($string,5,1);
echo $mystring;
?>
(I couldn't replace 0123456789 with Chinese numerals here, for example, because they are automatically converted into Latin digits on this website. Look:
零一二三四
五六七八九)
gilv
drraf at tlen dot pl
23-Feb-2005 06:44
Note: If the requested range lies outside the string, mb_substr() returns an empty _string_, whereas substr() returns _boolean_ false in this case.
Keep this in mind when using "===" comparisons.
Example code:
<?php
var_dump( substr( 'abc', 5, 2 ) ); // returns "false"
var_dump( mb_substr( 'abc', 5, 2 ) ); // returns ""
?>
It's especially confusing when using mbstring with function overloading turned on.
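One way to make strict comparisons robust against this difference is to treat both results as "nothing extracted". The helper name is_empty_substring below is hypothetical and used only for illustration.

```php
<?php
// Hypothetical helper: true when a substring call extracted nothing,
// whether the function reported that as false or as an empty string.
function is_empty_substring($part) {
    return $part === false || $part === '';
}

var_dump(is_empty_substring(mb_substr('abc', 5, 2, 'UTF-8'))); // bool(true)
var_dump(is_empty_substring(substr('abc', 5, 2)));             // bool(true)
var_dump(is_empty_substring(mb_substr('abc', 1, 1, 'UTF-8'))); // bool(false)
```

Unlike empty(), this check does not treat the string "0" as empty.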
