Package encoding defines an interface for character encodings, such as Shift
JIS and Windows 1252, that can convert to and from UTF-8.
Encoding implementations are provided in other packages, such as
golang.org/x/text/encoding/charmap and
golang.org/x/text/encoding/japanese.
Code Examples
package main

import (
	"io"
	"os"
	"strings"

	"golang.org/x/text/encoding/charmap"
)

func main() {
	// Decode a Windows-1252 byte stream to UTF-8 while copying it to stdout.
	sr := strings.NewReader("Gar\xe7on !")
	tr := charmap.Windows1252.NewDecoder().Reader(sr)
	io.Copy(os.Stdout, tr)
}
package main

import (
	"fmt"

	"golang.org/x/text/encoding"
	"golang.org/x/text/encoding/unicode"
	"golang.org/x/text/transform"
)

func main() {
	for i := 0; i < 2; i++ {
		var transformer transform.Transformer
		transformer = unicode.UTF16(unicode.BigEndian, unicode.IgnoreBOM).NewEncoder()
		if i == 1 {
			// On the second pass, reject invalid UTF-8 instead of replacing it.
			transformer = transform.Chain(encoding.UTF8Validator, transformer)
		}
		dst := make([]byte, 256)
		src := []byte("abc\xffxyz") // src is invalid UTF-8.
		nDst, nSrc, err := transformer.Transform(dst, src, true)
		fmt.Printf("i=%d: produced %q, consumed %q, error %v\n",
			i, dst[:nDst], src[:nSrc], err)
	}
}
Package-Level Type Names (total 3)
A Decoder converts bytes to UTF-8. It implements transform.Transformer.
Transforming source bytes that are not of that encoding will not result in an
error per se. Each byte that cannot be transcoded will be represented in the
output by the UTF-8 encoding of '\uFFFD', the replacement rune.
Transformer transform.Transformer
Bytes converts the given encoded bytes to UTF-8. It returns the converted
bytes or nil, err if any error occurred.
Reader wraps another Reader to decode its bytes.
The Decoder may not be used for any other operation as long as the returned
Reader is in use.
Reset resets the state and allows a Transformer to be reused.
String converts the given encoded string to UTF-8. It returns the converted
string or "", err if any error occurred.
Transform writes to dst the transformed bytes read from src, and
returns the number of dst bytes written and src bytes read. The
atEOF argument tells whether src represents the last bytes of the
input.
Callers should always process the nDst bytes produced and account
for the nSrc bytes consumed before considering the error err.
A nil error means that all of the transformed bytes (whether freshly
transformed from src or left over from previous Transform calls)
were written to dst. A nil error can be returned regardless of
whether atEOF is true. If err is nil then nSrc must equal len(src);
the converse is not necessarily true.
ErrShortDst means that dst was too short to receive all of the
transformed bytes. ErrShortSrc means that src had insufficient data
to complete the transformation. If both conditions apply, then
either error may be returned. Other than the error conditions listed
here, implementations are free to report other errors that arise.
Decoder : golang.org/x/text/transform.Transformer
Decoder : vendor/golang.org/x/text/transform.Transformer
func Encoding.NewDecoder() *Decoder
func golang.org/x/text/encoding/charmap.(*Charmap).NewDecoder() *Decoder
func golang.org/x/text/encoding/internal.FuncEncoding.NewDecoder() *Decoder
func golang.org/x/text/encoding/internal.(*SimpleEncoding).NewDecoder() *Decoder
An Encoder converts bytes from UTF-8. It implements transform.Transformer.
Each rune that cannot be transcoded will result in an error. In this case,
the transform will consume all source bytes up to, but not including, the
offending rune. Source bytes that are not valid UTF-8 will be replaced by
`\uFFFD`. To return early with an error instead, use transform.Chain to
preprocess the data with a UTF8Validator.
Transformer transform.Transformer
Bytes converts bytes from UTF-8. It returns the converted bytes or nil, err if
any error occurred.
Reset resets the state and allows a Transformer to be reused.
String converts a string from UTF-8. It returns the converted string or
"", err if any error occurred.
Transform writes to dst the transformed bytes read from src, and
returns the number of dst bytes written and src bytes read. The
atEOF argument tells whether src represents the last bytes of the
input.
Callers should always process the nDst bytes produced and account
for the nSrc bytes consumed before considering the error err.
A nil error means that all of the transformed bytes (whether freshly
transformed from src or left over from previous Transform calls)
were written to dst. A nil error can be returned regardless of
whether atEOF is true. If err is nil then nSrc must equal len(src);
the converse is not necessarily true.
ErrShortDst means that dst was too short to receive all of the
transformed bytes. ErrShortSrc means that src had insufficient data
to complete the transformation. If both conditions apply, then
either error may be returned. Other than the error conditions listed
here, implementations are free to report other errors that arise.
Writer wraps another Writer to encode its UTF-8 output.
The Encoder may not be used for any other operation as long as the returned
Writer is in use.
Encoder : golang.org/x/text/transform.Transformer
Encoder : vendor/golang.org/x/text/transform.Transformer
func HTMLEscapeUnsupported(e *Encoder) *Encoder
func ReplaceUnsupported(e *Encoder) *Encoder
func Encoding.NewEncoder() *Encoder
func golang.org/x/text/encoding/charmap.(*Charmap).NewEncoder() *Encoder
func golang.org/x/text/encoding/internal.FuncEncoding.NewEncoder() *Encoder
func golang.org/x/text/encoding/internal.(*SimpleEncoding).NewEncoder() *Encoder
func HTMLEscapeUnsupported(e *Encoder) *Encoder
func ReplaceUnsupported(e *Encoder) *Encoder
Encoding is a character set encoding that can be transformed to and from
UTF-8.
NewDecoder returns a Decoder.
NewEncoder returns an Encoder.
*golang.org/x/text/encoding/charmap.Charmap
golang.org/x/text/encoding/internal.Encoding
golang.org/x/text/encoding/internal.FuncEncoding
*golang.org/x/text/encoding/internal.SimpleEncoding
func golang.org/x/text/encoding/htmlindex.Get(name string) (Encoding, error)
func golang.org/x/text/encoding/unicode.UTF16(e unicode.Endianness, b unicode.BOMPolicy) Encoding
func golang.org/x/net/html/charset.DetermineEncoding(content []byte, contentType string) (e Encoding, name string, certain bool)
func golang.org/x/net/html/charset.Lookup(label string) (e Encoding, name string)
func golang.org/x/text/encoding/htmlindex.Name(e Encoding) (string, error)
var Nop
var Replacement
var golang.org/x/text/encoding/charmap.ISO8859_6E
var golang.org/x/text/encoding/charmap.ISO8859_6I
var golang.org/x/text/encoding/charmap.ISO8859_8E
var golang.org/x/text/encoding/charmap.ISO8859_8I
var golang.org/x/text/encoding/japanese.EUCJP
var golang.org/x/text/encoding/japanese.ISO2022JP
var golang.org/x/text/encoding/japanese.ShiftJIS
var golang.org/x/text/encoding/korean.EUCKR
var golang.org/x/text/encoding/simplifiedchinese.GB18030
var golang.org/x/text/encoding/simplifiedchinese.GBK
var golang.org/x/text/encoding/simplifiedchinese.HZGB2312
var golang.org/x/text/encoding/traditionalchinese.Big5
var golang.org/x/text/encoding/unicode.UTF8
var golang.org/x/text/encoding/unicode.UTF8BOM
Package-Level Functions (total 2)
HTMLEscapeUnsupported wraps encoders to replace source runes outside the
repertoire of the destination encoding with HTML escape sequences.
This wrapper exists to comply with URL and HTML forms requiring a
non-terminating legacy encoder. The produced sequences may lead to data
loss as they are indistinguishable from legitimate input. To avoid this
issue, use UTF-8 encodings whenever possible.
ReplaceUnsupported wraps encoders to replace source runes outside the
repertoire of the destination encoding with an encoding-specific
replacement.
This wrapper is only provided for backwards compatibility and legacy
handling. Its use is strongly discouraged. Use UTF-8 whenever possible.
Package-Level Variables (total 4)
ErrInvalidUTF8 means that a transformer encountered invalid UTF-8.
Nop is the nop encoding. Its transformed bytes are the same as the source
bytes; it does not replace invalid UTF-8 sequences.
Replacement is the replacement encoding. Decoding from the replacement
encoding yields a single '\uFFFD' replacement rune. Encoding from UTF-8 to
the replacement encoding yields the same as the source bytes except that
invalid UTF-8 is converted to '\uFFFD'.
It is defined at http://encoding.spec.whatwg.org/#replacement
UTF8Validator is a transformer that returns ErrInvalidUTF8 on the first
input byte that is not valid UTF-8.
Package-Level Constants (only one)
ASCIISub is the ASCII substitute character, as recommended by
https://unicode.org/reports/tr36/#Text_Comparison
The pages are generated with Golds v0.6.7. (GOOS=linux GOARCH=amd64)